Abstract

The dissertation examines various methods of nonparametric density estimation,
with nonparametric kernel estimation treated in more detail. The consequences of
various kernel window widths and their effect on the mean integrated square error
are examined using Monte Carlo techniques.

The mean and the variance of the nonparametric density estimator are derived for
symmetric kernels with finite mean and finite variance. The results also treat kernels
with varying window parameters.

The nonparametric kernel estimate was used to obtain new estimators for the
three-parameter Weibull distribution using distance estimation and the
Cramer-von-Mises statistic. Comparison with maximum likelihood estimators using a
Monte Carlo sample of size 1000 and various parameter values showed a significant
improvement over the maximum likelihood estimators in the mean integrated square
error between the estimated distribution and the true distribution.

Several new goodness of fit tests are proposed using the nonparametric kernel
estimator and the Cramer-von-Mises and the Anderson-Darling statistics. Extensive
Monte Carlo experiments were performed to obtain the critical values for the tests
and to study the power of the tests against eight alternative distributions. The
tests using the Anderson-Darling statistic showed greater power than the K.S. test
against almost all alternative distributions studied.

A new nonparametric kernel estimator was introduced by varying the window
width in each tail portion of the sample. The method permits a different window
width in each tail portion and in the center portion of the sample. The method uses
the sample percentile ratios, computed separately, as a measure of each tail length.
The kernel parameter for the tail sample values is chosen using the sample percentile
ratios for that tail. The nonparametric kernel estimator results in mean integrated
square errors comparable with those of the estimators developed earlier.


APPLICATIONS  OF  NON-PARAMETRIC  DENSITY 

ESTIMATION 


I. Introduction

Nonparametric density estimation is a rich research topic, both in estimation
techniques and in applications. Two previous dissertations under the supervision
of Prof. A. H. Moore studied density estimators with applications
(Sweeder, 1982 and Fuchs, 1984).

A  continuation  of  the  previous  research,  with  the  idea  of  exploring  some  new 
applications  of  the  nonparametric  density  estimation,  using  different  nonparametric 
density  estimators,  is  the  goal  for  this  research. 

This dissertation is divided into six main parts (chapters II-VII). The first
part surveys some of the known nonparametric density estimation methods with the
aim of looking at the different results and deciding which of these methods meets
the need for a nonparametric density estimation technique with the least number
of parameters and the most established theoretical results. Among these methods
are the orthogonal series method, the penalty functions method, the delta sequence
method, and the nearest neighbor method. This part concludes with a brief
descriptive comparison of these methods drawn from the literature.

Next, the kernel method, which (1) has only one parameter, (2) has the best
understood properties, (3) is invariant with respect to both location and scale,
and (4) is computationally effective, is discussed in chapter III. Since the choice of
the kernel is not as crucial as the choice of the parameter (window width h) in the
kernel method, the Gaussian kernel is chosen; it has infinite support, which avoids
the problem of finding the support of the estimated density that arises when using a
kernel with finite support. In this chapter it is also shown that, for the kernel method,
the mean of the nonparametric density is the sample mean, and the variance of the
nonparametric density is the sample variance plus the kernel variance. Since for
certain applications the invariance of the density estimator is required, the invariance
property for the kernel estimator is also shown. A suggested h is then introduced,
based on the approximate optimal choice of the window width. A Monte Carlo
experiment is designed with this proposed choice of h. The mean integrated square
error (MISE) is used as a measure for the closeness of the true density to the
estimated one. The results from different distributions are reported for sample
sizes 10(10)60.

In  chapter  IV  a  numerical  optimal  choice  of  h  is  derived  in  the  form  of  a 
constant  multiple  of  the  unbiased  estimator  of  the  standard  deviation  divided  by  the 
fifth  root  of  the  sample  size.  The  different  values  of  the  constant  of  multiplication 
together  with  the  corresponding  h  and  MISE  are  reported  for  various  distributions 
and  a  given  sample  size. 
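The form of the window width described above can be sketched as follows. This is only an illustration: the function name `window_width`, the constant `c = 1.06`, and the use of the square root of the unbiased variance estimate for the standard deviation are assumptions, not values taken from chapter IV.

```python
import numpy as np

def window_width(sample, c):
    """Window width h = c * s / n**(1/5); c is a hypothetical constant."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    # Assumption: s is the square root of the unbiased variance estimate.
    s = x.std(ddof=1)
    return c * s / n ** 0.2

rng = np.random.default_rng(0)
x = rng.normal(size=32)
h = window_width(x, c=1.06)  # c = 1.06 is illustrative only
```

Because s scales with the data and n does not change, an h of this form is scale invariant, which is the property required of the kernel parameter later in the text.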

Chapters V and VI consider parameter estimation for the three-parameter
Weibull distribution. In chapter V the log-likelihood equations are solved
numerically using the hybrid method. Chapter VI considers the use of the minimum
distance estimation technique to estimate the parameters of the three-parameter
Weibull distribution, using the Cramer-von-Mises statistic as a measure for the
closeness of the density function with parameters obtained by the maximum likelihood
method and the density function with parameters obtained by the new minimum
distance estimation method. Results from a Monte Carlo experiment of size 1000 are
reported for both methods. The results demonstrate an improvement of the new
estimation technique over the maximum likelihood technique.

In chapter VII a new modified goodness of fit technique for normality is
introduced. The critical values for the test are generated. The power of the test for
various alternative distributions is computed.

Chapter  VIII  introduces  an  adaptive  density  estimation  based  on  the  choice  of 
different  h  for  each  tail  of  the  distribution.  The  sample  percentile  ratios  are  used  as 
a  criterion  for  the  choice  of  h  in  the  tail  values  of  the  sample. 
II. Survey Of Some Nonparametric Density Estimation Methods


Introduction 

A  large  number  of  methods  of  nonparametric  density  estimation  have  been 
proposed.  These  methods  have  the  common  goal  of  estimating  a  density  function 
when  a  set  of  data  is  given.  A  few  types  of  estimates  were  first  proposed  in  Fix  and 
Hodges  (1951).  Although  the  nonparametric  estimates  involve  some  parameters, 
they  are  still  considered  nonparametric  in  the  sense  of  relaxing  the  assumptions 
about  the  distribution  of  the  observed  data. 

Many different methods of density estimation have been introduced and studied
for a long time. Monte Carlo comparisons have been done for various nonparametric
estimators. Some properties and basic results of the following nonparametric
density estimation methods are surveyed and discussed in this chapter: the
orthogonal series method, the penalty function method, the delta sequence method,
and the nearest neighbor method. The survey and discussion in this chapter
follow essentially the discussion by Paraska (1983:27-173). The kernel method is
the method used for the different applications in the dissertation, so it is treated
in a separate chapter, together with the reasons for choosing it and a Monte Carlo
experiment for different sample sizes from various distributions.

In the orthogonal series method, the density function is expressed in terms of
its orthogonal series expansion, and the estimate of the density is found by
estimating the coefficients in the expansion.

In the method of penalty functions, an estimator is obtained by maximizing
the likelihood function of the sample over a restricted class of densities, chosen
so that the likelihood function has a finite maximum when the underlying density
belongs to that class.

The  delta  sequence  method  is  in  fact  a  generalization  of  other  methods  such 
as  Fourier  inversion  and  the  kernel  method. 

The  nearest  neighborhood  method  is  based  on  fixing  a  constant  r  and  choosing 
the  rth  ordered  distance  of  all  the  observations  from  a  given  point,  then  using  this 
distance  as  a  smoothing  parameter. 

The  method  of  kernels  is  a  widely  used  method  in  applications  with  the  best 
understood  properties.  It  came  from  the  idea  of  the  naive  estimator,  which  is  an 
evolution  of  the  histogram  as  will  be  discussed  later  in  chapter  III. 

The performance of the different methods has been reported in the literature
and the methods studied; hence in the next part a summary of each of these methods
is presented.


The  Method  Of  Orthogonal  Series 

The idea of this method is to express the density function in terms of its
orthogonal series expansion and to find the estimate of the underlying density through
estimating the coefficients in the orthogonal expansion.

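This method was first introduced by Cencov in a 1962 paper. To show the
conditions under which it is possible to expand a function f in terms of a
complete orthonormal basis, let us assume that $\mathcal{X}$ is a space and
$\mathcal{M}$ is a $\sigma$-algebra of subsets of $\mathcal{X}$, i.e.

$$\emptyset \in \mathcal{M}, \tag{1}$$

$$E \in \mathcal{M} \Rightarrow \mathcal{X} \setminus E \in \mathcal{M}, \tag{2}$$

$$E_j \in \mathcal{M} \Rightarrow \cup_j E_j \in \mathcal{M} \tag{3}$$

where $\cup_j$ is the union over j.

Hence $(\mathcal{X}, \mathcal{M})$ will be a measurable space. Let
$\mu : \mathcal{M} \to [0, \infty]$ be a measure, i.e.

$$\mu(\emptyset) = 0, \tag{4}$$

$$E_j \in \mathcal{M} \Rightarrow \mu(\cup_j E_j) = \sum_j \mu(E_j) \tag{5}$$

where the $E_j$ are disjoint, and assume that the normed space $L_2(\mu)$ is separable.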
If $\mathcal{P}$ represents the family of probability measures on
$(\mathcal{X}, \mathcal{M})$ such that the Radon-Nikodym derivative
$dp/d\mu \in L_2(\mu)$ $\forall p \in \mathcal{P}$, let $B = \{b_i, i \ge 1\}$ be a
complete orthonormal basis for $L_2(\mu)$. Since B is complete, $f = dp/d\mu$ can be
written as:

$$f(x) = \sum_{i=1}^{\infty} a_i b_i(x) \tag{6}$$

where

$$a_i = \int f(x)\, b_i(x)\, d\mu(x) = E_f\, b_i(X) \tag{7}$$

which will correspondingly introduce the orthogonal series estimator of f based on a
random sample of size n to be defined as:

$$\hat f(x) = \sum_{i=1}^{L_n} \hat a_i\, b_i^*(x) \tag{8}$$

where $B^* = \{b_i^*, i \ge 1\}$ is a fixed version of B, the number of terms in the
expansion $L_n \to \infty$ as $n \to \infty$, and where $a_i$ is replaced by its estimator

$$\hat a_i = \frac{1}{n} \sum_{j=1}^{n} b_i^*(X_j) \tag{9}$$
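Equations (8) and (9) can be illustrated with a short sketch. The cosine basis on [0, 1], the Beta(2, 3) sample, and the choice of six terms are assumptions made only for this example; the text does not fix a basis or a cutoff.

```python
import numpy as np

def cosine_basis(i, x):
    """Orthonormal cosine basis on [0, 1]: b_1 = 1, b_i = sqrt(2) cos((i-1) pi x)."""
    x = np.asarray(x, dtype=float)
    if i == 1:
        return np.ones_like(x)
    return np.sqrt(2.0) * np.cos((i - 1) * np.pi * x)

def orthogonal_series_estimate(sample, grid, n_terms):
    """f_hat(x) = sum_{i=1}^{L_n} a_hat_i b_i(x), with a_hat_i = mean_j b_i(X_j)."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    f_hat = np.zeros_like(grid)
    for i in range(1, n_terms + 1):
        a_hat = cosine_basis(i, sample).mean()   # equation (9)
        f_hat += a_hat * cosine_basis(i, grid)   # equation (8)
    return f_hat

rng = np.random.default_rng(1)
sample = rng.beta(2.0, 3.0, size=500)   # hypothetical data supported on [0, 1]
grid = np.linspace(0.0, 1.0, 201)
f_hat = orthogonal_series_estimate(sample, grid, n_terms=6)
```

With this particular basis the estimate integrates to one because the first coefficient is exactly 1 and the cosine terms integrate to zero, but nothing prevents `f_hat` from taking negative values at some grid points.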
The properties of $\hat f$ are studied in Bosq (1970) and can be briefly summarized
in the following points:

1. $\mathrm{MISE} \to 0$ if and only if

$$\lim_{n \to \infty} \frac{1}{n} \int_{\mathcal{X}} \left[ \sum_{i=1}^{L_n} b_i^2(x) \right] f(x)\, d\mu(x) = 0$$

where MISE is the mean integrated square error defined as:

$$\mathrm{MISE} = E \int [\hat f(x) - f(x)]^2\, dx \tag{10}$$


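2. If

(i) f is continuous,

(ii) $B^* = \{b_i^*, i \ge 1\}$ are continuous and

$$M_n = \sup_{1 \le i \le L_n} \sup_{x \in \mathcal{X}} |b_i^*(x)| < \infty, \quad n \ge 1,$$

(iii) $\sum_{i=1}^{\infty} a_i b_i^*(x) \to f(x)$ uniformly, and

(iv) $\lim_{n \to \infty} M_n^4 (L_n^2 / n) = 0$

then

$$\lim_{n \to \infty} \sup_{x \in \mathcal{X}} E[|\hat f(x) - f(x)|^2] = 0 \tag{11}$$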


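3. If

(i) $B^*$ is uniformly bounded,

(ii) $\sum_{i=1}^{\infty} a_i b_i^*(x) \to f(x)$ uniformly,

(iii) $\exists\, m > 0 \ni \int_{\mathcal{X}} b_i^2(x) f(x)\, d\mu(x) \ge m$, $i \ge 1$,

(iv) $\lim_{n \to \infty} L_n = \infty$, and

(v) $\sum_{n=1}^{\infty} L_n \exp(-\lambda n / L_n^2) < \infty$, $\forall \lambda > 0$

then

$$d^* = \sup_{x} |\hat f(x) - f(x)| \to 0 \text{ as } n \to \infty \tag{12}$$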


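4. If

(i) $b_i$ is of bounded variation for all $i \ge 1$,

(ii) $\sum_{i=1}^{\infty} a_i b_i(x) \to f(x)$,

(iii) $\lim_{n \to \infty} L_n = +\infty$,

(iv) $\sum_{n=1}^{\infty} \exp\{-\alpha n / (M_n^2 V_n^2)\} < \infty$, $\forall \alpha > 0$

with

$$M_n = \sup_{1 \le i \le L_n} \sup_{x \in \mathcal{X}} |b_i(x)|, \qquad V_n = \sum_{i=1}^{L_n} V(b_i),$$

where $V(b_i)$ is the total variation of $b_i$, then

$$\sup_{x \in \mathcal{X}} |\hat f(x) - f(x)| \to 0 \text{ as } n \to \infty \tag{13}$$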
The necessary and sufficient conditions for the convergence of the density
estimator using the method of orthogonal functions are given by Bosq and Bleuez
(1976). Finally, the advantages and disadvantages of the method are stated in
the discussion section at the end of this chapter.

The  Method  Of  Penalty  Functions 

This method is characterized by applying a known methodology of estimation,
the maximum likelihood method, originally introduced by R. A. Fisher, which is
considered a universal method for optimal estimation.

The problem statement in this case is to find an estimate of the underlying
density function from which a sample of size n was drawn such that the likelihood
function is maximized. This is mathematically formulated as:

$$\max\; l(f \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i) \tag{14}$$

where $x_1, \ldots, x_n$ are i.i.d. random variables with a common unknown density f and
l is the likelihood function of the sample. This likelihood function does not have a
finite maximum when f belongs to the class of density functions $\mathcal{F}$. This
makes it necessary to set restrictions on $\mathcal{F}$ to avoid that infinite solution.
An approach for using the maximum likelihood principle is by penalizing those
functions giving an infinite solution. This infinite solution will essentially happen if
$\mathcal{F}$ contains a sequence of functions that converges pointwise to a Dirac
delta function. This means that the penalization would represent a way of deciding
between smoothness and goodness of fit.

Now, define a penalty function $P : \mathcal{F} \to \mathbb{R}$ as a real-valued
functional over $\mathcal{F}$; also define
$L(f) = \log l = \sum_{i=1}^{n} \log f(x_i)$ as the log-likelihood function and define

$$LP : f \mapsto L - \alpha P, \quad \alpha > 0 \tag{15}$$

as the logarithm of the penalized likelihood function.

Hence, the problem will be to find a measurable function
$\hat f : \mathbb{R}^n \to \mathcal{F}$ $\ni$ LP is maximized. This $\hat f$ is called
the maximum penalized likelihood estimator of f.

A suggested penalty function (Good and Gaskin 1971) has the form:

$$P(f) = \int_{-\infty}^{+\infty} \frac{[f'(x)]^2}{f(x)}\, dx \tag{16}$$

and the problem is formulated as:

$$\max\; LP(f) = L(f) - \alpha P(f) \tag{17}$$

subject to:

$$\int_{-\infty}^{+\infty} f(x)\, dx = 1 \tag{18}$$

$$f(x) \ge 0 \tag{19}$$

$$f(x_i) > 0, \quad \forall i = 1, \ldots, n \tag{20}$$

and

$$P(f) < \infty \tag{21}$$


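To avoid the non-negativity constraint Good and Gaskin used the substitution
$f = g^2$, which transforms the problem to:

$$\max\; LP(g) = 2 \sum_{i=1}^{n} \log |g(x_i)| - 4\alpha \int_{-\infty}^{+\infty} g'(x)^2\, dx \tag{22}$$

subject to

$$\int_{-\infty}^{+\infty} g^2(x)\, dx = 1 \tag{23}$$

$$\int_{-\infty}^{+\infty} g'(x)^2\, dx < \infty \tag{24}$$

and

$$|g(x_i)| > 0 \quad \forall i = 1, \ldots, n \tag{25}$$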


The estimator obtained by this method is a spline function with double
exponential splines and knots at the sample points.

An optimal solution for this problem, which is twice differentiable and has the
same sign for all x, was derived by Ghorai (1977).
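The effect of the Good and Gaskin substitution $f = g^2$ on the penalty and on the log-likelihood can be checked in two lines. This is a worked step added for illustration, assuming g is differentiable and nonzero where the ratio is evaluated:

```latex
f = g^2 \;\Rightarrow\; f' = 2 g g'
\;\Rightarrow\; \frac{[f'(x)]^2}{f(x)} = \frac{4 g(x)^2 g'(x)^2}{g(x)^2} = 4\, g'(x)^2 ,
\qquad
L(f) = \sum_{i=1}^{n} \log g(x_i)^2 = 2 \sum_{i=1}^{n} \log |g(x_i)| .
```

Hence $P(f) = 4 \int g'(x)^2\, dx$, and $L(f) - \alpha P(f)$ becomes $2 \sum_i \log |g(x_i)| - 4\alpha \int g'(x)^2\, dx$, the objective of the transformed problem, while the constraint $f \ge 0$ holds automatically.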

The  Method  Of  Delta  Sequence 

This method generalizes other methods such as the Fourier inversion method,
the kernel method, histograms and others. To define a delta sequence, let $\Phi$
be an element of the class of continuous functions with continuous derivatives of
all orders, i.e. $\Phi \in C^{\infty}$, with support $I = (a, b)$, $a, b \in \mathbb{R}$.
$\Delta = \{\delta_i(x, t)\}$ is a delta sequence on I if, for every $x \in I$:

$\delta_i : I \times I \to \mathbb{R}$ $\ni$ $\delta_i$ is bounded measurable $\forall i = 1, \ldots$ and

$$\lim_{i \to \infty} \int_I \delta_i(x, t)\, \Phi(t)\, dt = \Phi(x) \tag{26}$$

An estimator based on that method and an i.i.d. sample $x_1, \ldots, x_n$ from f(x)
would have the form:

$$\hat f_i(x) = \frac{1}{n} \sum_{j=1}^{n} \delta_i(x, x_j) \tag{27}$$

which gives a sequence of estimators when using the sequence $\Delta$. This estimator
can give other kinds of estimators like the ones mentioned above by a proper choice
of the delta sequence. The necessary and sufficient conditions for the asymptotic
unbiasedness for some delta sequence based estimators are given by Walter and Blum
(1976), while the asymptotic normality of such estimators is studied by Watson and
Leadbetter (1964).
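As an illustration of equation (27), the sketch below uses one particular delta sequence, $\delta_m(x, t) = m\,\varphi(m(x - t))$ with $\varphi$ the standard normal density. This choice, the function names, and the sample are assumptions made for the example; with it, the delta sequence estimator coincides with a Gaussian kernel estimator of window width 1/m.

```python
import numpy as np

def gaussian_delta(m, x, t):
    """delta_m(x, t) = m * phi(m (x - t)), phi the standard normal density."""
    u = m * (x - t)
    return m * np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def delta_sequence_estimate(sample, grid, m, delta=gaussian_delta):
    """f_hat_m(x) = (1/n) sum_j delta_m(x, x_j), evaluated on a grid."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    return delta(m, grid[:, None], sample[None, :]).mean(axis=1)

rng = np.random.default_rng(2)
sample = rng.normal(size=200)            # hypothetical data
grid = np.linspace(-8.0, 8.0, 1601)
f_hat = delta_sequence_estimate(sample, grid, m=2.0)
```

Passing a different `delta` function (for example an indicator of "same bin", scaled by the bin width) would give a histogram-type estimator from the same code path.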

The  Nearest  Neighbor  Method 

This method is based on the choice of a fixed constant r; by ordering the
distances of the n observations from a given point t, one picks the rth ordered
distance $w_r$. The mathematical formulation for this method comes from the idea
that the number of observations in an interval of width $2 w_r$ centered at t
is exactly r-1. This implies that:

$$r - 1 = 2 w_r\, n\, \hat f(t) \tag{28}$$

which means that:

$$\hat f(t) = \frac{r - 1}{2 w_r\, n} \tag{29}$$

which gives the estimate of f(t) based on the rth nearest neighbor. The method does
not  give  an  estimated  density  that  integrates  to  one.  The  estimator  for  this  method 
has  discontinuous  derivative  at  the  points 
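Equation (29) translates directly into code. The following is a minimal sketch; the function name and the toy sample are hypothetical, and the distance $w_r$ is taken as the rth smallest absolute difference between t and the observations.

```python
import numpy as np

def nearest_neighbor_estimate(sample, t, r):
    """f_hat(t) = (r - 1) / (2 n w_r(t)), w_r(t) = r-th smallest |X_i - t|."""
    sample = np.asarray(sample, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # Sorted distances from each evaluation point to every observation.
    dist = np.sort(np.abs(sample[None, :] - t[:, None]), axis=1)
    w_r = dist[:, r - 1]
    return (r - 1) / (2.0 * sample.size * w_r)

sample = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
value = nearest_neighbor_estimate(sample, 2.0, r=3)
```

For large |t| the distance $w_r$ grows like |t|, so the estimate decays only like 1/|t|, which is why it cannot integrate to one.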


Discussion  Of  Different  Methods 

This  section  surveyed  some  results  about  the  kernel  method,  orthogonal  series 
method,  the  method  of  penalty  functions  and  the  nearest  neighbor  method. 
It is well known that, in order to use the nonparametric density estimation
techniques successfully, there should be a sufficient amount of data and
reasonable information about the form of the underlying density function. A Monte
Carlo study comparing density estimators of both the kernel method and the method
of orthogonal series for specific distributions (normal, uniform, etc.) was performed
by Kumar and Markmann (1975).

In kernel estimation one must choose the kernel and the window width. The
choice of the kernel does not significantly affect the efficiency of the estimator;
however, the window choice changes both the bias and the variance of the estimator
of f(x) at each value of x. Since the underlying density is not known, there is no
guarantee that the choice of the window is the optimal one. However, the kernel
method gives an estimator which is a density when the kernel is chosen to be a
density, besides being computationally efficient.

In orthogonal series estimation one has to choose a basis and some cutoff
sequence. The choice of basis will affect the mean integrated square error. The
disadvantage of this method is that the basis is arbitrarily chosen independent of the
given data, and that it does not give estimates which are densities. Furthermore, the
estimators could be negative. However, it is more efficient computationally than the
kernel method since few terms give a sufficiently accurate estimator. A cosine-based
estimate has been suggested by Anderson (1969) to have good characteristics.

In the method of penalty functions, there is some complexity involved in the
calculation of the estimator; however, using a discrete maximum penalized likelihood
it becomes less complex. An advantage of the method is the assurance of the
non-negativity of the estimator, since the penalty function is a function of the
logarithm of the density.

The nearest neighbor method was developed to find a computationally fast
technique for estimating the density. Contrary to the kernel method, this method
over-smooths the distribution tails. Also, the estimates in this case are not
everywhere differentiable, and in general it does not yield an estimator that
integrates to unity.

However, after examining the methods discussed above in detail, the kernel
method is chosen for the applications studied in this research due to the
following properties:

(1) It is scale and location invariant if one chooses the parameter to be scale
invariant.

(2)  It  gives  a  proper  density  function  when  the  kernel  is  a  density  function. 

(3)  It  does  not  give  a  negative  estimator. 

(4)  It  has  only  one  parameter. 

(5)  It  directly  picks  the  support. 

(6)  It  is  fast  in  computations. 
III. Monte Carlo Comparison For Some Distributions Using The Kernel Method


Introduction 

The  histogram  as  a  basic  model  for  density  estimation,  and  the  naive  estimator 
are  introduced  in  this  chapter.  The  kernel  method  is  then  surveyed  as  being  a 
natural  evolution  of  the  naive  estimator.  Some  basic  properties  and  results  for  the 
kernel  estimator,  together  with  some  different  kernels  are  introduced.  The  mean  and 
variance  of  the  kernel  density  are  derived.  The  invariance  property  for  the  kernel 
method  is  shown.  Finally  a  Monte  Carlo  experiment  is  designed  to  examine  the 
behavior  of  a  set  of  different  distributions  under  a  proposed  choice  for  the  window 
width of the kernel estimator. The experiment uses different distributions with the
mean integrated squared error as the criterion for the comparison.

The  Histogram 

The histogram, if it is constructed so that it integrates to one, is simply an
estimate of the p.d.f. that is piecewise constant over a predetermined division of
the support of the estimator. It is expressed as a function of the number of
observations from a sample of size n ($X_1, X_2, \ldots, X_n$) in each of the
subdivisions or mesh of the support in the following way:

$$\hat f(x) = \frac{1}{nh} (\#\text{ of } X_i \text{ in the same bin as } x) \tag{30}$$

where h represents the width of each mesh or bin and is known as the bin width.

The bin width h can be allowed to vary, in which case the form for the estimator
will be:

$$\hat f(x) = \frac{1}{n} \cdot \frac{\#\text{ of } X_i \text{ in the same bin as } x}{\text{width of bin containing } x} \tag{31}$$
The  basic  properties  for  this  estimator  can  be  summarized  as: 

-  Simple  and  easy  way  of  data  representation. 

-  It  depends  on  the  choice  of  origin  and  bin  width. 

-  In  bivariate  and  trivariate  samples,  it  depends  on  the  grid  direction  of  the 
cells  (besides  origin  and  bin  width) 
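The dependence of the fixed-width histogram estimate on the origin can be seen in a small sketch. The half-open bins [origin + kh, origin + (k+1)h), the function name, and the toy sample are assumptions chosen for illustration.

```python
import numpy as np

def histogram_estimate(sample, x, origin, h):
    """f_hat(x) = (# of X_i in the same bin as x) / (n h)."""
    sample = np.asarray(sample, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Bin index k means the value lies in [origin + k h, origin + (k + 1) h).
    sample_bins = np.floor((sample - origin) / h)
    x_bins = np.floor((x - origin) / h)
    counts = (sample_bins[None, :] == x_bins[:, None]).sum(axis=1)
    return counts / (sample.size * h)

sample = np.array([0.1, 0.2, 0.7, 1.4, 1.6, 1.9])
f_a = histogram_estimate(sample, [0.5, 1.5], origin=0.0, h=1.0)
f_b = histogram_estimate(sample, [0.5, 1.5], origin=0.5, h=1.0)
```

The same data and the same bin width give different estimates at the same points once the origin is shifted from 0.0 to 0.5.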


The  Naive  Estimator 

Since f(x) can be expressed as the limit of the rate of change of F(x), then

$$f(x) = \lim_{\Delta x \downarrow 0} \frac{F(x + \Delta x) - F(x)}{\Delta x}
       = \lim_{\Delta x \downarrow 0} \frac{F(x + \Delta x) - F(x - \Delta x)}{2 \Delta x} \tag{32}$$

Hence, it is reasonable to estimate f(x) by $\hat f(x)$ as

$$\hat f(x) = \frac{F_n(x + \Delta x) - F_n(x - \Delta x)}{2 \Delta x} \tag{33}$$

where

$$F_n(x) = \frac{\text{no. of } X_i\text{'s} \le x}{n} \tag{34}$$

Now, using the conventional notation $h_n$ for the bin width instead of $\Delta x$,
where $h_n$ varies with the sample size n, then

$$\hat f(x) = \frac{1}{2 n h_n} [\text{no. of } X_i\text{'s} \in (x - h_n, x + h_n)] \tag{35}$$

where $h_n \to 0$ as $n \to \infty$.

This estimator is known as the naive estimator. The naive estimator can be
considered as a histogram with each observation as a center of a sampling interval.

This method gives a discontinuous estimator with jumps at $X_i \pm h_n$ and with
zero derivatives everywhere else.
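The naive estimator can be written in a few lines. This is a minimal sketch with a hypothetical function name and toy data, using the open interval $(x - h_n, x + h_n)$ as in the text.

```python
import numpy as np

def naive_estimate(sample, x, h):
    """f_hat(x) = (# of X_i in (x - h, x + h)) / (2 n h)."""
    sample = np.asarray(sample, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    inside = np.abs(sample[None, :] - x[:, None]) < h
    return inside.sum(axis=1) / (2.0 * sample.size * h)

sample = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
values = naive_estimate(sample, [2.0, 2.5], h=0.5)
```

At x = 2.0 one observation falls in the interval, while at x = 2.5 none does, so the estimate drops from 0.2 to 0 across the jump at $X_i + h_n = 2.5$.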


The  Kernel  Method 

In this section, a more detailed discussion of the kernel estimators with their
properties is considered. The naive estimator involves the idea of looking for a
function through which one is able to obtain a measure for the count of the number
of $X_i$'s in the interval $(x - h_n, x + h_n)$. Such a function is known as a kernel
function K(.) satisfying the regularity conditions:

(i) $\sup K(x) < M < \infty$, $|x| K(x) \to 0$ as $|x| \to \infty$.

(ii) K(x) is symmetric, $\int_{-\infty}^{+\infty} x^2 K(x)\, dx < \infty$.

(iii) K(x) has an absolutely integrable characteristic function,

and the estimator suggested in this case will have the form:

$$\hat f(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\left( \frac{x - X_i}{h_n} \right) \tag{36}$$

where $h_n \to 0$ as $n \to \infty$.

The previous discussion gives a brief introduction to the concept. This concept
can be summarized, in the case of the univariate spaces with continuous variables,
as placing a kernel at each point of the design sample $\{X_1, \ldots, X_n\}$. Averaging
the contributions of the different kernels at all points of the support results in the
kernel estimator.
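Placing a kernel at each sample point and averaging, as described above, can be sketched as follows. The Gaussian kernel matches the choice made in this dissertation; the function names, the sample, and the value h = 0.5 are assumptions for the example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kernel_estimate(sample, x, h, kernel=gaussian_kernel):
    """f_hat(x) = (1 / (n h)) sum_i K((x - X_i) / h)."""
    sample = np.asarray(sample, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (x[:, None] - sample[None, :]) / h
    return kernel(u).sum(axis=1) / (sample.size * h)

rng = np.random.default_rng(3)
sample = rng.normal(size=50)             # hypothetical data
grid = np.linspace(-10.0, 10.0, 2001)
f_hat = kernel_estimate(sample, grid, h=0.5)
```

Because the kernel is itself a density, `f_hat` is non-negative and integrates to one over the grid, up to numerical error.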

The kernel estimator resolves the major difficulties with histograms: the fixed
cell structure, the discontinuities at cell boundaries, the lack of tails, and the
exponential increase of the number of cells with the increase of the number of
variables. In spite of this, the kernel estimator has the problem of the choice of
the proper $h_n$. For a fixed n, a large $h_n$ gives a very smooth estimate, and a
small one gives an irregular estimate. As $h_n \to 0$ the nonparametric density
converges to a series of spikes at each of the observations. This means that a
difficulty corresponding to the choice of the cell size in the histogram will remain.

The  mathematical  properties  for  the  univariate  kernel  estimators  are  well 
known.  These  include  the  bias  and  the  asymptotic  results. 


The asymptotic properties of such an estimator are investigated by Parzen
(1962). The necessary and sufficient conditions for the uniform consistency with
probability one for kernel estimators are studied by Nadaraja (1965) and Schuster
(1970). Based on their study of the properties of the kernel estimator the following
theorem holds:

1. For a kernel function K(.) which is of bounded variation and such that
$\sum_{j=1}^{\infty} \exp(-\gamma j h_j^2)$ converges $\forall \gamma > 0$,

$$S = \sup_x |\hat f(x) - f(x)| \to 0 \text{ with probability 1 as } n \to \infty$$

if and only if f is uniformly continuous.


Now, several results are introduced on the consistency of the kernel estimator.
First define:

$$J = \int |\hat f - f|$$

then the following results hold:

1. If the kernel is a Borel measurable function on $\mathbb{R}^m$ $\ni$ $K \ge 0$,
$\int K = 1$, then the following statements are equivalent:

(i) $J \to 0$ in probability as $n \to \infty$ for some f.

(ii) $J \to 0$ in probability as $n \to \infty$, $\forall f$.

(iii) $J \to 0$ almost surely as $n \to \infty$, $\forall f$.

(iv) $J \to 0$ exponentially as $n \to \infty$, $\forall f$,
where the exponential convergence means: given $\epsilon > 0$, $\exists\, r, n_0 > 0 \ni$
$P(J > \epsilon) < \exp(-rn)$, $n \ge n_0$.

(v) $\lim_{n \to \infty} h_n = 0$, $\lim_{n \to \infty} n (h_n)^m = \infty$.

2. For any density f on $\mathbb{R}^m$, if K is an absolutely integrable function
$\ni$ $\int K = 1$, and $\lim_{n \to \infty} h_n = 0$, $\lim_{n \to \infty} n (h_n)^m = \infty$,
then

$J \to 0$ exponentially as $n \to \infty$, $\forall f$.

3. If K, f are densities on $\mathbb{R}^m$ and $J \to 0$ in probability as $n \to \infty$,
then

$\lim_{n \to \infty} h_n = 0$, $\lim_{n \to \infty} n (h_n)^m = \infty$.

4. $\sup_x |\hat f(x) - E[\hat f(x)]| \to 0$ as $n \to \infty$, $\forall$ distributions F.


5. Let $B(x) = E[\hat f(x)] - f(x)$ be the bias of the estimator, and

$$m_2 = \int_{-\infty}^{+\infty} x^2 K(x)\, dx$$

where K(x) satisfies:

(i) $\sup_x K(x) < M < \infty$; $|x| K(x) \to 0$ as $|x| \to \infty$.

(ii) $K(x) = K(-x)$, $x \in \mathbb{R}$; $\int_{-\infty}^{+\infty} x^2 K(x)\, dx < \infty$.

If f is a bounded density function and if $f''(x)$ exists, then

$$B(x) \approx \frac{h_n^2}{2}\, m_2\, f''(x)$$

The choice of the smoothing parameter h is more crucial than the choice of the
kernel itself. The approximate MISE as a function of h is given by:

$$\frac{1}{nh} \int K^2(t)\, dt + \frac{m_2^2}{4} \int \{f''(x) h^2\}^2\, dx \tag{37}$$

which upon differentiation w.r.t. h and equating to zero will give the optimal $h_{opt}$:

$$h_{opt} = m_2^{-2/5} \left\{ \int K^2(t)\, dt \right\}^{1/5} \left\{ \int f''(x)^2\, dx \right\}^{-1/5} n^{-1/5} \tag{38}$$

The  h  value  gets  bigger  as  the  second  derivative  of  f(x)  gets  smaller  and  consequently 
this  gives  a  smoother  estimator  and  a  smaller  approximate  MISE. 
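The optimal window width can be evaluated numerically once K and f are fixed. The sketch below assumes a Gaussian kernel and a normal density with standard deviation `sigma` (an illustrative case, not a result quoted from the text) and compares the numerical value with the closed form $(4/3)^{1/5} \sigma n^{-1/5}$ that follows for this case.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule on a grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def h_opt(n, sigma=1.0):
    """Evaluate the h_opt formula for a Gaussian kernel and an N(0, sigma^2) density."""
    t = np.linspace(-12.0, 12.0, 24001)
    K = np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)
    m2 = integrate(t ** 2 * K, t)          # second moment of the kernel
    RK = integrate(K ** 2, t)              # integral of K^2
    x = sigma * t
    f = np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    f2 = f * ((x / sigma) ** 2 - 1.0) / sigma ** 2   # second derivative of f
    Rf2 = integrate(f2 ** 2, x)            # integral of f''^2
    return m2 ** (-0.4) * RK ** 0.2 * Rf2 ** (-0.2) * n ** (-0.2)

h = h_opt(n=100, sigma=2.0)
closed_form = (4.0 / 3.0) ** 0.2 * 2.0 * 100 ** (-0.2)
```

The result is a constant times sigma divided by the fifth root of n, the same form as the window width proposed in the introduction.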

An approach for the kernel choice that uses the calculus of variations to derive a
kernel that optimizes the approximate MISE gives a kernel with efficiency 1, which
is known as the Epanechnikov kernel. This kernel is given as:

$$K_e(x) = \begin{cases} \dfrac{3}{4\sqrt{5}} \left( 1 - \dfrac{x^2}{5} \right) & -\sqrt{5} \le x \le \sqrt{5} \\ 0 & \text{otherwise} \end{cases}$$

Defining the efficiency of a kernel as the ratio of its MISE relative to that of
Epanechnikov, the relative efficiency of different kernels is given in Table 1.


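Table 1. Different Kernels with their Efficiency

kernel          K(x)                                                                        Efficiency

Epanechnikov    $\frac{3}{4\sqrt{5}}(1 - x^2/5)$ if $-\sqrt{5} \le x \le \sqrt{5}$, 0 otherwise    1.00

Boxed           $\frac{1}{2}$ if $-1 < x < 1$, 0 otherwise                                  0.9295

Biweight        $\frac{15}{16}(1 - x^2)^2$ if $-1 < x < 1$, 0 otherwise                     0.9939

Gaussian        $\frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$                                        0.9512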
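The efficiencies in Table 1 can be reproduced numerically. The sketch assumes one common closed form for the efficiency, $\frac{3}{5\sqrt{5}} \{\int t^2 K\}^{-1/2} \{\int K^2\}^{-1}$; this formula and the kernel expressions below are assumptions of the example rather than definitions quoted from the text.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule on a grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def efficiency(kernel, lo, hi, num=200001):
    """eff(K) = 3 / (5 sqrt(5)) / (sqrt(m2) * integral of K^2), m2 = integral of t^2 K."""
    t = np.linspace(lo, hi, num)
    K = kernel(t)
    m2 = integrate(t ** 2 * K, t)
    RK = integrate(K ** 2, t)
    return 3.0 / (5.0 * np.sqrt(5.0)) / (np.sqrt(m2) * RK)

r5 = np.sqrt(5.0)
kernels = {
    "Epanechnikov": (lambda t: 0.75 / r5 * (1.0 - t ** 2 / 5.0), -r5, r5),
    "Boxed": (lambda t: np.full_like(t, 0.5), -1.0, 1.0),
    "Biweight": (lambda t: 15.0 / 16.0 * (1.0 - t ** 2) ** 2, -1.0, 1.0),
    "Gaussian": (lambda t: np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi), -12.0, 12.0),
}
eff = {name: efficiency(k, lo, hi) for name, (k, lo, hi) in kernels.items()}
```

All four values are close to one, which is the numerical content of the statement that the choice of kernel matters much less than the choice of h.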

The idea of choosing a smoothing parameter value that will subjectively agree
with a priori information about the underlying distribution is valuable in terms of
specific applications, even if it seems not to be a nonparametric approach. In other
words, the choice of the smoothing parameter in some applications can be made by
making use of the information known or at least assumed about the distribution
form.

Different approaches have been proposed to find a reasonable choice of the h
parameter. Among those methods are the least squares cross validation, the
likelihood cross validation and the test graph method.

A comparative simulation study of some of the kernel methods for sample
sizes 25, 50, and 100 and for different distributions of varying tail length is
presented by Bowman in his 1980 paper.


Mean  and  Variance  Of  The  Estimator 
Theorem 

Let $K(x)$ be a symmetric kernel with mean $E_K(x)$ and variance $V_K(x)$ such that $E_K(x), V_K(x) < \infty$ and $\int K(x)\,dx = 1$. If $\hat f(x)$ is a nonparametric kernel estimator based on a sample of size n $(X_1, \dots, X_n)$ with $K(x)$ as a kernel, then $\hat f(x)$ has a mean $\bar x$ and a variance $V_K(x) + s^2$, where $\bar x$ is the sample mean and $s^2$ is the sample variance.

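proof

The kernel estimator based on the sample $(X_1, \dots, X_n)$ is:

$$\hat f(x) = \frac{1}{n}\sum_{i=1}^{n} K(x - X_i) \tag{39}$$

Hence the expected value of the random variable x with $\hat f(x)$ as a density function will be:

$$E(x) = \int x\,\hat f(x)\,dx = \frac{1}{n}\sum_{i=1}^{n}\int x\,K(x - X_i)\,dx = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar x$$

As for the variance, we have:

$$E(x - \bar x)^2 = E(x^2) - \bar x^2 = \frac{1}{n}\sum_{i=1}^{n}\int x^2 K(x - X_i)\,dx - \bar x^2 = \frac{1}{n}\sum_{i=1}^{n}\left(V_K(x) + X_i^2\right) - \bar x^2 = V_K(x) + s^2$$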

Corollaries 


1. The mean and the variance of a kernel estimator with a Gaussian kernel asymptotically approach the mean and variance of the empirical distribution function. This can be shown in the following way:

In the case of a Gaussian kernel, with $\hat f(x)$ given as:

$$\hat f(x) = \frac{1}{nh\sqrt{2\pi}}\sum_{i=1}^{n}\exp\left[-\frac{1}{2}\left(\frac{x - X_i}{h}\right)^2\right]$$

the expected value and the variance will be:

$$E(x) = \bar x$$

$$V(x) = h^2 + s^2 \tag{43}$$

and since $h \to 0$ as $n \to \infty$, then $E(x) = \bar x$ and $V(x) \to s^2$, which are the mean and the variance of the empirical distribution function.
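
The result for the Gaussian kernel can be checked numerically. The following is a minimal sketch under stated assumptions: the data values and h are arbitrary illustrations, the function names are hypothetical, and $s^2$ is computed with divisor n, for which the identity $V(x) = h^2 + s^2$ holds exactly.

```python
import math

def gaussian_kde(x, sample, h):
    """Gaussian kernel density estimate at x with window width h."""
    n = len(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (n * h * math.sqrt(2 * math.pi))

def trapezoid(f, a, b, steps=4000):
    """Composite trapezoidal rule on [a, b]."""
    w = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * w)
    return total * w

sample = [1.2, -0.4, 0.3, 2.1, 0.9, -1.5, 0.0, 1.7]  # illustrative data
h = 0.5                                              # illustrative window width
n = len(sample)
xbar = sum(sample) / n
s2 = sum((xi - xbar) ** 2 for xi in sample) / n      # divisor n, not n - 1

# Integrate far enough into the tails that the Gaussian kernels are negligible.
lo, hi = min(sample) - 10 * h, max(sample) + 10 * h
mean_kde = trapezoid(lambda x: x * gaussian_kde(x, sample, h), lo, hi)
var_kde = trapezoid(lambda x: (x - mean_kde) ** 2 * gaussian_kde(x, sample, h), lo, hi)
```

Here `mean_kde` should match `xbar` and `var_kde` should match `h**2 + s2` up to integration error.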

2. For different kernels $K_i(x)$ with variances $V_i$, $i = 1, \dots, n$, each used at one of the sample points $X_1, \dots, X_n$ respectively, the mean remains the sample mean while the variance will be:

$$V(x) = \frac{1}{n}\sum_{i=1}^{n} V_i + s^2 \tag{44}$$

The kernel estimator is location and scale invariant; this property is derived in the following section.

Invariance Property Of The Kernel Method

The invariance property for the kernel method is shown in this section under two transformations. First, the location transformation, where all the observations are moved either to the left or to the right. Second, the scale transformation, where all the observations are either compressed or expanded by a constant factor.

a) The transformation

$$Z_i = X_i - c \tag{45}$$

$$\hat f(x) = \frac{1}{nh}\sum_{j=1}^{n} K\left(\frac{x - X_j}{h}\right) \tag{46}$$

thus, with $\hat g$ denoting the estimator based on the transformed sample $Z_1, \dots, Z_n$,

$$\hat g(x) = \frac{1}{nh}\sum_{j=1}^{n} K\left(\frac{x + c - X_j}{h}\right) = \hat f(x + c) \tag{47}$$
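
The location invariance in (47) can be illustrated with a short numerical check. This is a sketch with a Gaussian kernel and arbitrary illustrative data; the shift c, the window width h, and the function name `kde` are assumptions made for the example.

```python
import math

def kde(x, sample, h):
    """Gaussian kernel density estimate at x with window width h."""
    n = len(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (n * h * math.sqrt(2 * math.pi))

sample = [4.1, 5.3, 3.8, 6.0, 4.7]      # illustrative observations X_i
c, h = 3.0, 0.4                         # illustrative shift and window width
shifted = [xi - c for xi in sample]     # Z_i = X_i - c

# Compare the estimate from the shifted data at z with the original estimate at z + c.
grid = [-5 + 0.5 * k for k in range(21)]
max_gap = max(abs(kde(z, shifted, h) - kde(z + c, sample, h)) for z in grid)
```

The largest gap over the grid is at the level of floating point rounding, since h does not change under a pure shift of the data.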


b) The transformation

$$Z_i = X_i / k \tag{48}$$

Since the h value is a linear function of the sample standard deviation, the new h value resulting from the transformation of the data by a scale k in the above way will be h such that:


where $X_1, X_2, \dots, X_n$ are independent, identically distributed observations from f, and $K(\cdot)$ is a function that satisfies the regularity conditions stated on page 19.

The choice of the parameter h in the kernel method is critical, since this parameter controls the smoothness of the resulting estimator.

For the univariate case, the h parameter can frequently be chosen visually in a satisfactory manner (Wahba, 1983). However, the need for a predetermined choice of the h parameter in most applications of nonparametric density estimation suggests examining the behavior of different distributions under a proposed choice of h. A Monte Carlo experiment of size 1000 is used to examine the behavior of the estimators for six different distributions. These distributions are:

-Uniform. 

-Exponential. 

-Cauchy. 

-Double  Exponential. 

-Logistic. 

-Normal. 

The criterion chosen for the comparison is the mean integrated square error.

The optimum choice for h is shown in equation (38) to be a constant times $n^{-1/5}$. Furthermore, Silverman (1986) shows that the optimum h for the normal is $1.06\,\sigma\,n^{-1/5}$ using a normal kernel. Therefore, a data dependent h equal to $s\,n^{-1/5}$, where s represents the sample standard deviation for sample size n, is chosen for the Monte Carlo study. It also gives a scale invariant nonparametric density estimate, since s is a scale invariant estimator of $\sigma$.

Now, the data based choice of h was used with the kernel technique when a Gaussian kernel is utilized, in which case the estimator takes the form:

$$\hat f(x) = \frac{1}{nh}\sum_{i=1}^{n}\phi\left(\frac{x - X_i}{h}\right) \tag{52}$$

where $\phi(x)$ represents the p.d.f. of the standard normal distribution. Sample sizes 10, 20, ..., 60, i.e. 10(10)60, are used, and the MISE is defined as:

$$MISE = \int E[\hat f(x) - f(x)]^2\,dx = E\int [\hat f(x) - f(x)]^2\,dx \tag{53}$$

where $\hat f(x)$ denotes the nonparametric estimator based on the previous choice of h, while $f(x)$ is one of the six distributions mentioned.

To  evaluate  the  performance  of  the  method  over  the  various  distributions,  the 
Monte  Carlo  experiment  is  designed  the  same  way  for  all  the  six  distributions  and 
the  different  sample  sizes. 

The methodology is as follows. A sample of a given size from 10(10)60 is generated from each of the distributions, using the IMSL routines RNUN, RNEXP, RNCAU, and RNNOR for the uniform, exponential, Cauchy, and normal distributions respectively, while an inverse C.D.F. technique is used for the double exponential and the logistic distributions. The data based choice of the smoothing parameter is then calculated for each of the 1000 different samples. The integrated square error (ISE), given as:

$$ISE = \int [\hat f(x) - f(x)]^2\,dx \tag{54}$$

is then computed for each sample using the IMSL integration routine QDAGI with bounds $-\infty$ and $\infty$. These bounds are modified to (-50, +50) only for the logistic distribution, to avoid the numerical difficulty of computation beyond these limits. An estimate of MISE is then obtained by averaging the ISE from the 1000 Monte Carlo repetitions. Likewise, an estimate of the standard deviation of MISE is computed. The results of the Monte Carlo experiment for the different sample sizes are given in Table 2, where the table entries give the MISE for the different sample sizes with the standard deviation in brackets.
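
The procedure can be sketched in a few lines for the normal case. This is only an illustration under stated assumptions: Python's standard generator replaces the IMSL routines, a trapezoidal rule on the finite range [-8, 8] replaces the IMSL integrator, and only 200 repetitions are used to keep the run short. All function names are hypothetical.

```python
import math
import random
import statistics

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def kde(x, sample, h):
    """Gaussian kernel estimate, equation (52)."""
    n = len(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (n * h * math.sqrt(2 * math.pi))

def ise(sample, true_pdf, lo=-8.0, hi=8.0, steps=160):
    """Integrated square error, equation (54), by the trapezoidal rule."""
    h = statistics.stdev(sample) * len(sample) ** (-0.2)   # h = s * n^(-1/5)
    w = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * w
        d = kde(x, sample, h) - true_pdf(x)
        total += (0.5 if i in (0, steps) else 1.0) * d * d
    return total * w

def mc_mise(n, reps, seed=12345):
    """Average ISE over `reps` standard normal samples of size n."""
    rng = random.Random(seed)
    values = [ise([rng.gauss(0.0, 1.0) for _ in range(n)], normal_pdf)
              for _ in range(reps)]
    return statistics.mean(values), statistics.stdev(values)

mise_hat, ise_sd = mc_mise(n=20, reps=200)
```

With a different generator and fewer repetitions the estimate will not reproduce the tabulated entry exactly, but it should fall in the same neighborhood as the normal column for n = 20.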

The results of the Monte Carlo show that the choice of h which is near optimal for the normal ($h_{opt}$ for the normal is $1.06\,\sigma\,n^{-1/5}$) gives comparable results for the double exponential and the logistic distributions, while a reasonable fit was found for the Cauchy. A relatively large MISE is obtained for the uniform and exponential distributions, which indicates that the choice is further from optimal for these distributions.

Table 2. Values of MISE for different distributions with standard deviation based on M.C size 1000 and sample of size n for each repetition

n     Uniform      Expon.       Cauchy       D.E          Logistic     Normal
10    0.19142      0.15297      0.06277      0.04318      0.06412      0.03004
      (0.13260)    (0.05^75)    (0.03842)    (0.02920)    (0.06444)    (0.03561)
20    0.12664      0.12950      0.06834      0.02672      0.04487      0.01970
      (0.05745)    (0.03629)    (0.04147)    (0.01526)    (0.03402)    (0.01617)
30    0.10690      0.11982      0.06991      0.02140      0.04131      0.01107
      (0.03691)    (0.02800)    (0.04038)    (0.01146)    (0.02593)    (0.01042)
40    0.09509      0.11237      0.07368      0.01835      0.03989      0.01151
      (0.03028)    (0.02478)    (0.04011)    (0.00067)    (0.02234)    (0.00843)
50    0.08800      0.10575      0.07727      0.01651      0.03916      0.00996
      (0.02429)    (0.02220)    (0.03953)    (0.00825)    (0.01934)    (0.00704)
60    0.08267      0.10191      0.07998      0.01503      0.03904      0.00883
      (0.02017)    (0.01889)    (0.03921)    (0.00731)    (0.01763)    (0.00635)

IV. Optimal Choice Of The Smoothing Parameter


Introduction 

The choice of the h parameter is the most essential step in successful nonparametric density estimation using the kernel method. This choice is theoretically derived by optimizing the approximate MISE defined in chapter III. In this chapter a Monte Carlo experiment is performed to approximate the optimal choice of the h parameter for the Gaussian kernel. This choice is crucial for goodness of fit applications, as well as for any other application that requires a nonparametric estimate of the density. The different distributions considered are the:

-  Uniform 

-  Exponential 

-  Cauchy 

-  Double  Exponential 

-  Logistic 

-  Normal 

These  distributions  represent  different  shapes  and  characteristics. 

Hence, the purpose of this chapter is to find an optimal h and the corresponding MISE for 1000 different samples, each of size 20, from the above distributions.

Methodology


The optimal h with respect to minimizing the approximate MISE is given as:

$$h_{opt} = m_2^{-2/5}\left\{\int K^2(t)\,dt\right\}^{1/5}\left\{\int f''(x)^2\,dx\right\}^{-1/5} n^{-1/5} \tag{55}$$

where $m_2$ is the kernel second moment (see Parzen, 1962).

This approximate optimal value $h_{opt}$ is derived for the different distributions as a first step:

1.  Uniform  distribution 

For  the  uniform  distribution  the  approximate  expression  gives  a  zero  h  since 
the  density  is  constant.  This  case  corresponds  to  the  E.D.F  estimator.  However  the 
M.C  results  indicate  that  this  value  does  differ  from  zero. 

2.  Exponential  distribution 

For the one parameter exponential distribution with variance $V(x) = 1/\alpha^2$ and p.d.f. given as:

$$f(x) = \alpha e^{-\alpha x}, \quad \alpha > 0,\ x > 0 \tag{56}$$

$$f'(x) = -\alpha^2 e^{-\alpha x} \tag{57}$$

$$f''(x) = \alpha^3 e^{-\alpha x} \tag{58}$$

so that

$$\int_0^\infty f''(x)^2\,dx = \frac{\alpha^5}{2} = \frac{1}{2\sigma^5}$$

and $m_2 = 1$ for the Gaussian kernel, where $\sigma = 1/\alpha$ is the standard deviation of the distribution. Hence, by substituting in the formula for the approximate optimal h, the corresponding h for this distribution will be:

$$h_{opt} = .8918\,\sigma\,n^{-1/5} \tag{61}$$

3.  Cauchy  distribution 

For the Cauchy distribution with a density f(x), the optimal h is derived below:

$$f(x) = \frac{1}{\pi\left[(x - a)^2 + 1\right]}, \quad -\infty < x < \infty,\ -\infty < a < \infty \tag{62}$$

$$f''(x) = \frac{2}{\pi}\,\frac{3(x - a)^2 - 1}{\left[(x - a)^2 + 1\right]^3} \tag{63}$$

Expressing $\int f''(x)^2\,dx$ in terms of the beta functions $B(5/2, 7/2)$, $B(3/2, 9/2)$, and $B(1/2, 11/2)$ gives

$$\int f''(x)^2\,dx = .1992 \tag{64}$$

$$h_{opt} = 1.0721\,n^{-1/5} \tag{65}$$

where B is a beta function.

4. Double Exponential distribution

For the double exponential density given by:

$$f(x) = \frac{1}{2\theta}\,e^{-|x|/\theta}, \quad \theta > 0 \tag{66}$$

and similar to the previous case the optimal approximate h is:

$$h_{opt} = .7244\,\sigma\,n^{-1/5} \tag{67}$$


5.  Logistic  distribution 


The logistic density function in two parameters is given by:

$$f(x) = \frac{\exp[-(x - a)/b]}{b\,(1 + \exp[-(x - a)/b])^2} \tag{68}$$

with

$$E(x) = a, \quad V(x) = \frac{b^2\pi^2}{3}, \quad mode(x) = a$$

Differentiating,

$$f'(x) = -\frac{e^{-\frac{x-a}{b}}\left(1 - e^{-\frac{x-a}{b}}\right)}{b^2\left(1 + e^{-\frac{x-a}{b}}\right)^3} \tag{69}$$

$$f''(x) = \frac{e^{-\frac{x-a}{b}}\left[1 - 4e^{-\frac{x-a}{b}} + e^{-2\left(\frac{x-a}{b}\right)}\right]}{b^3\left(1 + e^{-\frac{x-a}{b}}\right)^4} \tag{70}$$

Now, let $y = \exp\left(-\frac{x-a}{b}\right)$, and hence

$$\int_{-\infty}^{\infty} f''(x)^2\,dx = \int_0^\infty \frac{y^2(1 - 4y + y^2)^2}{b^6(1 + y)^8}\,\frac{b}{y}\,dy$$

$$= \frac{1}{b^5}\int_0^\infty \frac{y\,(1 - 4y + y^2)^2}{(1 + y)^8}\,dy$$

$$= \frac{1}{b^5}\left[B(2,6) - 8B(3,5) + 18B(4,4) - 8B(5,3) + B(6,2)\right]$$

$$= \frac{1}{b^5}\left[2B(2,6) - 16B(3,5) + 18B(4,4)\right]$$

$$= \frac{1}{b^5}\,(.02381) \tag{71}$$

Thus, the optimal h for the Gaussian kernel is given as:

$$h_{opt} = 1.6396\,b\,n^{-1/5} = \frac{\sqrt{3}}{\pi}\,(1.6396)\,\sigma\,n^{-1/5} = 0.9039\,\sigma\,n^{-1/5} \tag{72}$$

where $\sigma$ is the distribution standard deviation.


6.  Normal  distribution 

Silverman (1986) shows, as an example, that the value of h based on the previous approximation is:

$$h_{opt} = 1.06\,\sigma\,n^{-1/5} \tag{73}$$


The h values obtained are summarized in the following table (Table 3).

Table 3. h Values for Different Distributions

Distribution          h
Exponential           $.8918\,\sigma\,n^{-1/5}$
Cauchy                $1.0721\,n^{-1/5}$
Double Exponential    $.7244\,\sigma\,n^{-1/5}$
Logistic              $.9039\,\sigma\,n^{-1/5}$
Uniform               0.0
Normal                $1.06\,\sigma\,n^{-1/5}$
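
The constants multiplying $\sigma\,n^{-1/5}$ can be recomputed from equation (55). The sketch below does this for the exponential, double exponential, logistic, and normal rows, taking $m_2 = 1$ and $\int K^2 = 1/(2\sqrt{\pi})$ for the Gaussian kernel and scaling each density to unit standard deviation. The helper names are hypothetical, and the double exponential second derivative is taken away from the origin, as the approximation implicitly does.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

R_K = 1 / (2 * math.sqrt(math.pi))      # integral of K^2 for the Gaussian kernel

def h_constant(f2, upper, symmetric):
    """c in h_opt = c * n**(-1/5), from equation (55) with m2 = 1.

    f2 is the second derivative of the density on [0, upper]; for a density
    symmetric about 0 the integral over the whole line is twice that value.
    """
    r = simpson(lambda x: f2(x) ** 2, 0.0, upper)
    if symmetric:
        r *= 2
    return (R_K / r) ** 0.2

theta = 1 / math.sqrt(2)                # double exponential scale for sigma = 1
b = math.sqrt(3) / math.pi              # logistic scale for sigma = 1

def exp_f2(x):                          # exponential, alpha = 1
    return math.exp(-x)

def dexp_f2(x):                         # double exponential, x > 0
    return math.exp(-x / theta) / (2 * theta ** 3)

def logistic_f2(x):                     # logistic, equation (70) with a = 0
    y = math.exp(-x / b)
    return y * (1 - 4 * y + y * y) / (b ** 3 * (1 + y) ** 4)

def normal_f2(x):                       # standard normal
    return (x * x - 1) * math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

constants = {
    "Exponential": h_constant(exp_f2, 40.0, symmetric=False),
    "Double Exponential": h_constant(dexp_f2, 40.0, symmetric=True),
    "Logistic": h_constant(logistic_f2, 40.0, symmetric=True),
    "Normal": h_constant(normal_f2, 12.0, symmetric=True),
}
```

Under these assumptions the four constants agree with Table 3 to the precision shown there.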

A numerical improvement of the previously recommended h is sought, based on the use of an unbiased estimator for the standard deviation and a linear search around the previous h value for an h with smaller MISE. The general form of the proposed estimator for h is a constant multiple of the unbiased estimator of $\sigma$ times $n^{-1/5}$. Let $\hat\sigma$ represent the unbiased estimator of $\sigma$, which is given as:

$$\hat\sigma = \frac{\sqrt{n}\,\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{2}\,\Gamma\left(\frac{n}{2}\right)}\,\tilde\sigma \tag{74}$$

where $\Gamma$ is the gamma function and $\tilde\sigma$ is given by:

$$\tilde\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar x)^2} \tag{75}$$

thus

$$\hat\sigma = \frac{\sqrt{\sum_{i=1}^{n}(X_i - \bar x)^2}\;\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{2}\,\Gamma\left(\frac{n}{2}\right)} \tag{76}$$
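
Equation (76) is straightforward to compute; the only practical point is that the gamma functions overflow for large n, so the ratio is best taken on the log scale. The sketch below is a hypothetical helper, not code from the study; note that the correction makes the estimator unbiased for normal samples.

```python
import math
import statistics

def unbiased_sigma(sample):
    """Estimator of sigma from equation (76), unbiased for normal samples."""
    n = len(sample)
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)          # sum of squared deviations
    # Gamma((n-1)/2) / Gamma(n/2), computed via log-gamma to avoid overflow.
    ratio = math.exp(math.lgamma((n - 1) / 2) - math.lgamma(n / 2))
    return math.sqrt(ss / 2) * ratio

# For n = 20 the estimator is about 1.3 percent larger than the usual
# sample standard deviation (divisor n - 1).
example = [float(i) for i in range(20)]
inflation = unbiased_sigma(example) / statistics.stdev(example)
```

The inflation factor depends only on n, so for the sample size 20 used in this chapter the correction is small.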


The Monte Carlo experiment here is designed for a sample size of 20, in which case the optimal h is assumed to be $h_{opt} = k\,\hat\sigma\,n^{-1/5}$. The experiment starts with generating 1000 samples, each of size 20, from the 6 distributions (uniform, exponential, Cauchy, double exponential, logistic, and normal). Defining an interval H such that:

$$H = \{h_i \mid h_{opt} - l \le h_i \le h_{opt} + u\} \tag{77}$$

where $h_{opt}$ is as defined in the above table, a search in the closed interval H for an $h_i$ that minimizes the MISE is performed. The search starts by subdividing the interval into a mesh of m equal subintervals. Computing the MISE at each of the (m+1) end points of the subintervals gives an array of MISE's. The minimum MISE corresponds to an optimal $h_i$ value in H. If the optimal $h_i$ lies on either end of the interval H, then the search interval is expanded by l or u, for the lower or upper end point of H respectively, and the search continues.


Upon finding the optimal $h_i$, as described above, a constant k is computed as:

$$k = \frac{h_i}{\hat\sigma\,n^{-1/5}}$$

which defines the factor that relates the choice of $h_i$ to the unbiased estimator of the standard deviation and the sample size. The following table gives the average optimal $h_i$, the average k, and the average MISE over the 1000 different samples, with their standard deviations in brackets.
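
The mesh search with interval expansion can be sketched as follows. This is an illustration under assumptions, not the code used in the study: the objective is any function of h (here a toy quadratic), the lower end is kept positive by a hypothetical `lower_bound`, and the number of mesh points stays fixed when the interval is expanded.

```python
def grid_search(objective, center, l, u, m=20, max_expansions=50, lower_bound=1e-6):
    """Minimize objective(h) over a mesh on [center - l, center + u].

    If the minimum falls on an end point, the interval is extended by l
    (lower end) or u (upper end) and the mesh search is repeated.
    """
    lo, hi = max(center - l, lower_bound), center + u
    for _ in range(max_expansions + 1):
        step = (hi - lo) / m
        grid = [lo + j * step for j in range(m + 1)]     # m + 1 end points
        values = [objective(h) for h in grid]
        best = min(range(m + 1), key=values.__getitem__)
        if best == 0 and lo > lower_bound:
            lo = max(lo - l, lower_bound)                # expand downward
        elif best == m:
            hi += u                                      # expand upward
        else:
            return grid[best], values[best]
    return grid[best], values[best]

def k_from_h(h, sigma_hat, n):
    """Constant k relating h to sigma_hat * n**(-1/5)."""
    return h / (sigma_hat * n ** (-0.2))

# Toy objective with its minimum at h = 2.0, outside the starting interval.
h_best, value_best = grid_search(lambda h: (h - 2.0) ** 2, center=0.5, l=0.3, u=0.3)
```

In the setting of this chapter the objective would be the integrated square error of the kernel estimate for a given sample, evaluated at each mesh point.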


Table 4. Optimal h for sample size 20 for Different Distributions

Distribution          # of samples   h_opt               k                  MISE
Uniform               698            .1629 (.0515)       1.0589 (.4443)     .1116 (.0394)
Exponential           163            .2719 (.0915)       .5334 (.1902)      .0865 (.0304)
Cauchy                163            7.1591 (12.8467)    .9657 (.0800)      .0675 (.0407)
Double Exponential    600            .5922 (.1141)       .8376 (.2723)      .0216 (.0140)
Logistic              347            .7862 (.0899)       1.4821 (.1142)     .0211 (.0164)
Normal                1000           .6160 (.1008)       1.1789 (.3190)     .0145 (.0124)

The method shows an improvement over the choice of h as $s\,n^{-1/5}$. The percentage improvement for each distribution is given in the following table:

Table 5. Percentage improvement in MISE relative to choice of h as $s\,n^{-1/5}$ for different distributions

Distribution          % improvement
Uniform               11
Exponential           33.2
Cauchy                1
Double Exponential    19
Logistic              52
Normal                26.4
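
As a worked check of these percentages, the sketch below assumes that the comparison is between the n = 20 row of Table 2 (h taken as s times n to the power -1/5) and the MISE column of Table 4. That pairing is an inference from the numbers, not something stated explicitly in the text.

```python
# MISE with h = s * n**(-1/5), n = 20 (Table 2).
baseline = {"Uniform": 0.12664, "Exponential": 0.12950, "Cauchy": 0.06834,
            "Double Exponential": 0.02672, "Logistic": 0.04487, "Normal": 0.01970}

# MISE at the searched optimal h (Table 4).
searched = {"Uniform": 0.1116, "Exponential": 0.0865, "Cauchy": 0.0675,
            "Double Exponential": 0.0216, "Logistic": 0.0211, "Normal": 0.0145}

# Percentages as reported in Table 5.
reported = {"Uniform": 11, "Exponential": 33.2, "Cauchy": 1,
            "Double Exponential": 19, "Logistic": 52, "Normal": 26.4}

improvement = {name: 100 * (baseline[name] - searched[name]) / baseline[name]
               for name in baseline}
```

Under this assumption the recomputed percentages agree with the reported ones to within one percentage point.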

This shows that the choice of $s\,n^{-1/5}$ is a rather good one over the set of distributions studied.

The following graphs show examples from the uniform, normal, exponential, and logistic distributions using the constant k given in Table 4.

Figure 5. A nonparametric p.d.f. for the Cauchy distribution with sample size 60

Figure 8. A nonparametric c.d.f. for the double exponential distribution with sample size 60

Next, an example from each distribution is given for a sample of size 60. The h parameter used is $h = k\,s\,n^{-1/5}$. The seed used for the uniform distribution is the same as for the other distributions. The value of the ISE for the uniform distribution is .0514, for the Cauchy is .0632, for the double exponential is .0076, for the logistic is .0064, and for the normal is .0012.

The uniform distribution fit shows an almost linear behavior for the C.D.F. in the interval [.1, .9]; however, due to the infinite support of the Gaussian kernel, the support of the estimated density is [-.4, 1.4]. This is typical behavior for such an estimator, and remedial measures can be taken to handle such a case; however, the objective is to get a tool that can be used in Monte Carlo studies of relatively large size, where it is not possible to visually examine each case separately. The exponential distribution example shows that the estimated density support is close to the real support. It also indicates that the estimated density is not quite smooth. The behavior can be improved by a larger choice of the h parameter; however, this causes the ISE to be larger. The bump near x = 5.5 is due to the existence of at least one observation near the upper tail portion of the support. For the Cauchy distribution a constant multiple of $n^{-1/5}$ is used, as pointed out in the Monte Carlo experiment. Both tails are rough; however, the middle portion of the distribution is reasonably close. The double exponential case gives a fairly close fit except at the lower tail of the distribution. The logistic distribution, for this case, does not give as good a fit as the normal. The normal distribution example shows a good fit at both tails, with the nonparametric distribution skewed to the right. This indicates that the observations from the sample selected are not quite symmetric about the true mean, and hence it shows how the nonparametric distribution follows the sample behavior.

V. Parameter Estimation


Introduction

Parameter estimation for the three parameter Weibull distribution is discussed in this chapter. While the problem has been handled in different ways, the method used here is based on the numerical solution of the log-likelihood equations using the hybrid method. The hybrid method is an iterative method. The method is surveyed and the stopping rule is stated. The results from this chapter will be compared with the results from the next chapter.

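Maximum Likelihood Estimation For The Parameters Of The Three Parameter Weibull Distribution

The likelihood function for the three parameter Weibull is given by:

$$L(x_1, \dots, x_n; \delta, \theta, \beta) = \prod_{i=1}^{n} f(x_i) = \left(\frac{\beta}{\theta^\beta}\right)^n \prod_{i=1}^{n}(x_i - \delta)^{\beta - 1}\,\exp\left(-\theta^{-\beta}\sum_{i=1}^{n}(x_i - \delta)^\beta\right)$$

which gives the following set of equations upon differentiation of the log likelihood function with respect to the three unknown parameters:

$$-\frac{n\beta}{\theta} + \beta\,\theta^{-(\beta+1)}\sum_{i=1}^{n}(x_i - \delta)^\beta = 0 \tag{80}$$

$$\frac{n}{\beta} - n\ln\theta + \sum_{i=1}^{n}\ln(x_i - \delta) + \theta^{-\beta}\ln\theta\sum_{i=1}^{n}(x_i - \delta)^\beta - \theta^{-\beta}\sum_{i=1}^{n}\left[(x_i - \delta)^\beta\ln(x_i - \delta)\right] = 0 \tag{81}$$

$$-(\beta - 1)\sum_{i=1}^{n}(x_i - \delta)^{-1} + \beta\,\theta^{-\beta}\sum_{i=1}^{n}(x_i - \delta)^{\beta-1} = 0 \tag{82}$$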

The solution of these equations gives a vector $\hat\Theta = (\hat\delta, \hat\theta, \hat\beta)$ that maximizes the log likelihood function (and hence also the likelihood function).

The first equation gives the parameter $\theta$ as a function of $\delta$ and $\beta$ in the form:

$$\theta = \theta(\delta, \beta) \tag{83}$$

while the other two equations are not explicitly solvable for $\beta, \delta$. By substituting $\theta$ from the first equation into the other two equations, these last two equations become:

$$\frac{n}{\beta} + \sum_{i=1}^{n}\ln(x_i - \delta) - \frac{n\sum_{i=1}^{n}(x_i - \delta)^\beta\ln(x_i - \delta)}{\sum_{i=1}^{n}(x_i - \delta)^\beta} = 0 \tag{84}$$

$$-(\beta - 1)\sum_{i=1}^{n}(x_i - \delta)^{-1} + n\beta\,\frac{\sum_{i=1}^{n}(x_i - \delta)^{\beta-1}}{\sum_{i=1}^{n}(x_i - \delta)^\beta} = 0 \tag{85}$$

The system of the 3 nonlinear maximum likelihood equations in $\Theta = (\delta, \theta, \beta)$ is solved using a numerical technique known as the hybrid method. This is basically an iterative method based on the Newton-Raphson method, where the equations have the form:

$$L_i(\Theta) = L_i(\delta, \theta, \beta) = 0, \quad i = 1, 2, 3 \tag{86}$$

where the vector $\Theta$ represents the triplet of the Weibull parameters (location, scale and shape). In this case the Newton-Raphson solution for these equations takes the form:

$$\Theta^{(k+1)} = \Theta^{(k)} - \left[L'(\Theta^{(k)})\right]^{-1} L(\Theta^{(k)}), \quad k = 0, 1, \dots \tag{87}$$

where $L'(\Theta)$ denotes the Gateaux derivative of L, and L is Gateaux differentiable at $\Theta$ if there exists a linear operator A such that:

$$\lim_{t \to 0}\frac{\| L(\Theta + th) - L(\Theta) - tAh \|}{t} = 0 \quad \forall h \in R^{(3)} \tag{88}$$

This method has quadratic convergence; however, it suffers from the pitfall of failure to converge if the initial guess $\Theta^{(0)}$ is far away from the solution $\Theta^*$.
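
The basic Newton-Raphson step in (87) can be illustrated on a small system. The sketch below solves an arbitrary two-equation example (not the Weibull likelihood equations) with an analytic Jacobian; the function names and the example system are assumptions made for illustration.

```python
import math

def newton_2d(F, J, start, tol=1e-12, max_iter=50):
    """Newton-Raphson for two equations in two unknowns.

    F(x, y) returns (f1, f2); J(x, y) returns the Jacobian entries
    (a, b, c, d) for the matrix [[a, b], [c, d]].
    """
    x, y = start
    for k in range(max_iter):
        f1, f2 = F(x, y)
        if math.hypot(f1, f2) < tol:
            return (x, y), k
        a, b, c, d = J(x, y)
        det = a * d - b * c
        if det == 0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * (dx, dy) = (f1, f2) by Cramer's rule, then step.
        dx = (d * f1 - b * f2) / det
        dy = (a * f2 - c * f1) / det
        x, y = x - dx, y - dy
    raise RuntimeError("no convergence within max_iter iterations")

# Example system: x^2 + y^2 = 4 and x*y = 1.
F = lambda x, y: (x * x + y * y - 4, x * y - 1)
J = lambda x, y: (2 * x, 2 * y, y, x)

solution, iterations = newton_2d(F, J, start=(2.0, 0.5))
```

Started close to a root, the iteration converges in a handful of steps; started far away, or where the Jacobian is nearly singular, it can fail, which is the difficulty the modifications discussed next are meant to address.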

Several different modifications were introduced to overcome that problem. Among these is the norm reducing method, where the derivative is multiplied by a factor such that the norm will be non-increasing as the iterations progress. Another method is to ensure that the derivative is non-singular by adding a constant to its diagonal elements, so that the new matrix is non-singular when the derivative is singular. A third method is to compute the derivative only occasionally. A more detailed discussion of such methods is due to Ortega (1970).

The difficulty of such basic methods is the need to compute 3 components of L and 9 entries of L'. Several other modifications are introduced by Powell (1970) to alleviate this problem by avoiding the direct computation of L', replacing it by difference approximations. Harter and Moore in their 1965 paper solved the system of nonlinear equations for joint maximum likelihood estimation from complete and censored samples of the three parameter Weibull (and also of the three parameter Gamma). The proposed iterative procedure was applied to the general case as well as to cases when any one or any two of the three parameters were known. The iterative scheme used here was proposed by Powell (1970), where the derivative was not just scaled by a small factor; instead, a negative multiple of the gradient of $L(\Theta)$ is introduced, so that the direction of the correction in the different iterations remains sensible when the Jacobian is almost singular.

The method can be applied in two cases: when the first derivative L' is given or when it is numerically approximated. Since in our case the functional form of the derivative is not complicated, the approach with a given Jacobian is used.

Methodology 

The technique is basically a modification of the Levenberg/Marquardt idea for the classical Newton-Raphson iterative scheme for the solution of a nonlinear system of equations, through the usage of:

(1) A negative multiple of the gradient of $L(\Theta)$ to avoid the near singularity of the Jacobian matrix.

(2) A flexible choice of the difference between $\Theta^{(k+1)}$ and $\Theta^{(k)}$ in each step, used to decrease the number of iterations depending on the increase or decrease of $L(\Theta)$.

The running time of the algorithm depends, in general, on the number of equations, the functional behavior of $L(\Theta)$, the initial or starting point $\Theta^{(0)}$, and the accuracy required in terms of the step difference and the norm.

An accuracy of .01 was used for the absolute difference between two successive $\Theta$'s, while the Euclidean norm accuracy was relaxed, since the MISE criterion is to be used later for the comparison and the interest was mainly in the convergence of the $\Theta$ parameter.


The algorithm did not converge in a few cases (24 cases), which were excluded from the Monte Carlo results. This happened because the method searches for a zero of the system of nonlinear equations $L(\Theta) = 0$ by minimizing the quadratic form $L^T(\Theta)\,L(\Theta)$, the sum of squares of the maximum likelihood equations, and in these cases the minimum did not give a zero of the system.

The initial guess is chosen to be the same for all of the different Monte Carlo samples of size 1000.

It was proved by Powell in 1970 that the iterations stop due to one of the mentioned stopping rules, or otherwise converge to a solution $\Theta^*$, provided that the Jacobian matrices are bounded and $L(\Theta^{(0)})$ is finite. Powell also proved that the algorithm will stop after a finite number of iterations by one of the two stopping rules, provided that $L(\Theta)$ has continuous, bounded first derivatives.

Stopping  Criterion 

In addition, the technique introduces two stopping criteria:

First is the step length in two successive iterations, which is taken as .01.

Second is the maximum number of iterations, which is taken as 1000.

Results


The results from the previous application are shown in tables 5 to 17 at the end of chapter IV, where cases of shape parameter 1, 2, 3 and 4 for sample sizes 10, 20, and 30 with location 10.0 and scale 5.0 are given. The tables show the sample used for each case. The integrated square error (ISE) and the function norm are also given as measures of the closeness and accuracy of the nonlinear solution. The mean integrated square errors from the Monte Carlo experiment are shown at the end of the next chapter, where they are compared with the results from the minimum distance estimation technique.

VI. Minimum Distance Estimation


Introduction 

Minimum distance estimation (MDE) was proposed by Wolfowitz (Wolfowitz, 1950). Parr and Schucany demonstrated the robustness of MDE in predicting the location of symmetric distributions (Parr and Schucany, 1980). Hobbs, Moore, and James (Hobbs and others, 1984) used MDE to find the location of the gamma distribution. Similarly, Hobbs, Moore, and Miller (Hobbs and others, 1985) used MDE to estimate the location of the Weibull. In recent research (Gallagher and Moore, 1989) the previous work was extended by applying MDE to all the distribution parameters and by testing the robustness of MDE.

MDE selects as estimates those p.d.f. parameters which minimize the discrepancy between the sample data and the estimated distribution. The distance measures which are minimized are "goodness of fit statistics" (g.o.f.).

The MDE has the following characterization and properties:

1. Not susceptible to outliers (Parr and Schucany, 1980).

2. Statistically consistent (Wolfowitz, 1957).

3. Easily applied to all the parameters (Parr and Schucany, 1980).

A series of logical candidates for the distance estimation task is studied by Fuchs (Fuchs, 1984). This series includes:

- General exponential power distribution.

- Generalized beta distribution.

- Generalized gamma distribution.

- Generalized t distribution.

- R-S distribution: which was originally developed to generate random variates (Ramberg and Schmeiser 1979). It is a generalization of Tukey's lambda function and can be used to model a wide variety of data.

The probability density function of the R-S distribution is given in terms of the percentile function, R(p):

$$f(x \mid p, a, b, c, d) = f(R(p)) = \left(c\,p^{c-1} + d\,(1 - p)^{d-1}\right)/b \tag{89}$$

$$R(p) = a + \left(p^c - (1 - p)^d\right)/b \tag{90}$$

where $-\infty < a < x < \infty$; $-\infty < a, b, c, d < \infty$, $0 < p < 1$

- Generalized life model: developed by Moore and Bilikan, which includes the Weibull and the Rayleigh distributions as special cases. The p.d.f. is given by:

$$f(x; a, b, g(x)) = b\,g'(x)\,(g(x))^{b-1}\exp\left(-(g(x))^b/a\right)/a \tag{91}$$

where $g(x) \in R^1$, $\lim_{x \to 0^+} g(x) = 0$, $\lim_{x \to \infty} g(x) = \infty$, and $g(x)$ is strictly increasing, $0 < x, a, b < \infty$


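Minimum Distance Estimation For The Three Parameter Weibull Distribution

The 3-parameter Weibull density function is given by:

$$f(x) = \frac{\beta}{\theta}\left(\frac{x - \delta}{\theta}\right)^{\beta - 1}\exp\left[-\left(\frac{x - \delta}{\theta}\right)^\beta\right], \quad \delta < x,\ \theta, \beta > 0 \tag{92}$$

with expected value

$$E(x) = \delta + \theta\,\Gamma\left(\frac{\beta + 1}{\beta}\right)$$

and with variance

$$V(x) = \theta^2\left[\Gamma\left(\frac{\beta + 2}{\beta}\right) - \Gamma^2\left(\frac{\beta + 1}{\beta}\right)\right]$$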

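The kernel estimator of chapter III is used with a Gaussian kernel, which is defined as:

$$\hat f(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right) = \frac{1}{nh}\sum_{i=1}^{n}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x - X_i}{h}\right)^2\right]$$

The C.D.F. of this kernel density, $\hat F(x)$, is given as:

$$\hat F(x) = \int_{-\infty}^{x}\hat f(t)\,dt = \frac{1}{nh}\sum_{i=1}^{n}\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{t - X_i}{h}\right)^2\right]dt = \frac{1}{n}\sum_{i=1}^{n}\Phi\left(\frac{x - X_i}{h}\right)$$

where $\Phi(t)$ denotes the C.D.F. for a standard normal random variable.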


The Cramer-von Mises statistic $W^2$ is used. This g.o.f. statistic is defined as:

$$ W^{2} = n \int \left[F(x) - F_0(x)\right]^{2} dF_0(x) $$  (101)

or the computational formula:

$$ W^{2} = \sum_{j=1}^{n} \left[F_j - \frac{2j-1}{2n}\right]^{2} + \frac{1}{12n} $$  (102)
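The computational formula for the Cramer-von Mises statistic is easy to check numerically. The sketch below assumes that the input values are the hypothesized C.D.F. evaluated at the sample points (they are sorted inside the function); the function name is illustrative.

```python
def cramer_von_mises(F_values):
    """W^2 = sum_j (F_j - (2j - 1) / (2n))^2 + 1 / (12n), j = 1..n."""
    F = sorted(F_values)
    n = len(F)
    # With a zero-based index j, the term (2j - 1) becomes (2j + 1).
    total = sum((F[j] - (2 * j + 1) / (2.0 * n)) ** 2 for j in range(n))
    return total + 1.0 / (12.0 * n)
```

When the values F_j fall exactly on the midpoints (2j - 1)/(2n), the sum vanishes and the statistic attains its minimum value 1/(12n).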


As noted earlier, the optimal value of the window width h (in the MISE
sense) depends on the choice of the kernel K, the underlying unknown density f(x),
and the sample size n, i.e.

$$ h_{opt} = f_1(K) \cdot f_2(f(x)) \cdot f_3(n) $$  (103)

A reasonable approximation for this optimal value for a normal sample is
$h = k n^{-1/5}$, where k is a real constant (see equation 38). Although this approximation
simplifies the optimal expression for the window width and works well for the
normal distribution, it is less accurate for other distributions. This leads to the idea
of introducing the underlying density into another approximating expression for
h. The explicit expression for $h_{opt}$ is given as:

$$ h_{opt} = m_2^{-2/5} \left\{\int K^{2}(t)\,dt\right\}^{1/5} \left\{\int f''(x)^{2}\,dx\right\}^{-1/5} n^{-1/5} $$  (104)


where $m_2$ denotes the kernel second moment.
In the case of a Gaussian kernel:

$$ m_2 = \int t^{2} K(t)\,dt = V(t) = 1 $$  (105)

Also, $\int K^{2}(t)\,dt$ is simply equal to $\frac{1}{2\sqrt{\pi}}$.
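To make the Gaussian-kernel constants concrete, the following sketch evaluates the explicit expression for the optimal window width under the additional assumption that the underlying density is normal with standard deviation sigma. That assumption is introduced here only for illustration; for a normal density the integral of the squared second derivative is 3 / (8 sqrt(pi) sigma^5), a standard result. The function name is hypothetical.

```python
import math


def h_opt_normal_reference(sigma, n):
    """Optimal h for a Gaussian kernel when f is assumed N(mu, sigma^2)."""
    m2 = 1.0                                        # kernel second moment
    kernel_sq = 1.0 / (2.0 * math.sqrt(math.pi))    # integral of K(t)^2
    # Assumption: f is normal, so the integral of f''(x)^2 is known in closed form.
    curvature = 3.0 / (8.0 * math.sqrt(math.pi) * sigma ** 5)
    return (m2 ** (-2.0 / 5.0) * kernel_sq ** (1.0 / 5.0)
            * curvature ** (-1.0 / 5.0) * n ** (-1.0 / 5.0))
```

Under this assumption the expression collapses to h = (4/3)^{1/5} sigma n^{-1/5}, which has the form k n^{-1/5} quoted above for a normal sample.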


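Now, let

$$ l_i(x) = \exp\left[-\frac{1}{2}\left(\frac{x - X_i}{h}\right)^{2}\right] $$  (106)

Hence, $\hat{f}(x)$ can be written as

$$ \hat{f}(x) = \frac{1}{\sqrt{2\pi}\, n h} \sum_{i=1}^{n} l_i(x) $$  (108)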


$$ l_i'(x) = -\frac{x - X_i}{h^{2}} \exp\left[-\frac{1}{2}\left(\frac{x - X_i}{h}\right)^{2}\right] = s_i(x)\, l_i(x) $$  (110)

thus

$$ l_i(x)\, l_j(x) = \sqrt{\pi}\, h \exp\left[-\frac{1}{4h^{2}} (X_i - X_j)^{2}\right] g_{ij}(x) $$  (117)

where $g_{ij}(x)$ is a normal density with mean $(X_i + X_j)/2$ and variance $h^2/2$.

Now, let $l_i(x)$, $l_j(x)$ be written as $l_i$, $l_j$ to simplify the notation.
On substituting the previous integral into the expression for the optimal h,
$h_{opt}$ can be written as:

$$ h_{opt} = T(h) $$  (123)

or equivalently as:

$$ \Psi(h) = h_{opt} - T(h) = 0 $$  (124)

which can be solved by one of the general methods for the solution of one
equation in one unknown, such as Newton's method, the secant method, Steffensen's
method, or any of their variations.

Newton's method has the form:

$$ h_{k+1} = h_k - \left[\Psi'(h_k)\right]^{-1} \Psi(h_k) $$  (125)

which gives quadratic convergence, i.e.

$$ \| h_{k+1} - h^{*} \| \le c\, \| h_k - h^{*} \|^{2} $$  (126)

for $h_k$ sufficiently close to $h^{*}$.
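The Newton iteration for a scalar equation of the form h - T(h) = 0 can be sketched as follows. The function T used in the usage lines is a toy stand-in (the cosine), chosen only because its fixed point is well known; it is not the T(h) of the window-width problem, and the central-difference derivative is an assumption made to keep the example self-contained.

```python
import math


def newton_root(psi, h0, tol=1e-12, max_iter=50, eps=1e-6):
    """Solve psi(h) = 0 by Newton's method with a numerical derivative."""
    h = h0
    for _ in range(max_iter):
        value = psi(h)
        slope = (psi(h + eps) - psi(h - eps)) / (2.0 * eps)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        h_next = h - value / slope
        if abs(h_next - h) < tol:
            return h_next
        h = h_next
    raise RuntimeError("Newton iteration did not converge")


def toy_T(h):
    """Hypothetical stand-in for T(h); not the function from the text."""
    return math.cos(h)


root = newton_root(lambda h: h - toy_T(h), h0=1.0)
```

Near the root the error is roughly squared at each step, so a handful of iterations suffices from a reasonable starting value; a poor starting value can still send the iteration astray, which is the usual caveat with Newton-type methods.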

An alternative for computing the window width, which is more efficient computationally
and gives a good improvement in this application, is to choose an empirical
h equal to $s n^{-1/5}$, where s represents the sample standard deviation. This suggested
h gave an MISE close to the theoretical optimum, and it is simple: it avoids both
the extensive computations and the occasional degeneracy of the iterative approach.
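The empirical rule can be checked against the worked examples. The sketch below applies it to the ten data values listed under Table 7; the use of the n - 1 divisor for the sample standard deviation is an assumption, and the function name is illustrative.

```python
import statistics


def empirical_window_width(sample):
    """Return h = s * n**(-1/5), with s the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # assumption: n - 1 divisor
    return s * n ** (-1.0 / 5.0)


# Data values listed under Table 7 (shape 1.0, sample size 10).
table7_sample = [
    10.009320, 10.226890, 10.798260, 10.866060, 11.054560,
    11.788680, 14.245620, 14.910420, 17.955210, 24.277580,
]
h = empirical_window_width(table7_sample)
```

For these values the rule gives a window width close to the 2.8563 reported in Table 7, which suggests that the n - 1 divisor matches the computation used there.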

Methodology 

The Monte Carlo procedure for this application can be described in the following
three steps:

Step I

-  Samples of different sizes are generated from a Weibull distribution with a given
location, scale, and shape. The uniform random numbers are generated using
the RNUN routine from the IMSL library.

-  The Weibull deviates are generated using the inverse C.D.F. technique.
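The inverse C.D.F. step can be sketched as follows. Python's standard random module stands in for the IMSL RNUN routine here, which is an assumption made only so that the example is self-contained; the function names are illustrative.

```python
import math
import random


def weibull_inverse_cdf(u, location, scale, shape):
    """Invert F(x) = 1 - exp(-((x - location) / scale)**shape) at u in [0, 1)."""
    return location + scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def weibull_sample(n, location, scale, shape, seed=0):
    """Generate n three-parameter Weibull deviates from uniform numbers."""
    rng = random.Random(seed)
    return sorted(weibull_inverse_cdf(rng.random(), location, scale, shape)
                  for _ in range(n))
```

For example, weibull_sample(20, 10.0, 5.0, 2.0) produces a sorted sample of size 20 from W(10,5,2), and every generated value is at least the location parameter.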

Step II

-  The MLE estimators for the three parameters are computed as discussed earlier.

-  The CvM statistic is computed for the estimated density with the MLE for the
parameters.

Step III

-  The CvM statistic is minimized with Q as the decision vector, subject to the
given constraints on the values of the parameters.

-  The nonlinear program is solved using a quasi-Newton method.

-  The new parameter estimates are compared with those of the MLE, using the
ISE as the measure for the comparison.
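The objective minimized in Step III can be sketched as a function of the three Weibull parameters. The version below is an assumption-laden illustration: it takes the distance to be n times the integral of the squared difference between the kernel C.D.F. and the Weibull C.D.F., and evaluates the integral by a midpoint rule after the substitution p = F0(x). The quasi-Newton minimization itself is not shown; any general-purpose optimizer could be applied to this function. All names are hypothetical.

```python
import math


def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def kde_cdf(x, sample, h):
    """Gaussian-kernel C.D.F. estimate at x."""
    return sum(normal_cdf((x - xi) / h) for xi in sample) / len(sample)


def weibull_quantile(p, location, scale, shape):
    return location + scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def cvm_distance(params, sample, h, grid=400):
    """n * integral of (F_hat - F0)^2 dF0, midpoint rule in p = F0(x)."""
    location, scale, shape = params
    if scale <= 0.0 or shape <= 0.0:
        return math.inf  # outside the constraint set
    total = 0.0
    for k in range(grid):
        p = (k + 0.5) / grid
        x = weibull_quantile(p, location, scale, shape)
        total += (kde_cdf(x, sample, h) - p) ** 2
    return len(sample) * total / grid
```

Working in the variable p avoids truncating the infinite range of x, because the Weibull percentile function is available in closed form.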

Results 

Together with the results from the previous chapter, the end result for this
application is shown in Table 6. The table shows that the MLE method and
the new technique are statistically the same for shape parameter 1. However, the
new technique shows a significant improvement over the MLE method for shape
parameters 2, 3, and 4. For shape parameter 2 the new method gives an MISE that
is 5.3 times smaller than that of the MLE, while for shape parameter 3
the MISE from the new technique is about 6 times smaller than that of the MLE.
For shape parameter 4 the improvement is largest: the ratio of the MISE for
the MLE to that of the new technique is 15.9. Tables 7 to 18 give examples
from each of the four shape parameter values chosen for the Monte Carlo study. These
tables give a case for each value of the shape parameter (1, 2, 3, and 4) for sample
sizes 10, 20, and 30 with location 10.0 and scale 5.0. The same sample is used to
iteratively solve the maximum likelihood nonlinear equations. The integrated square
error (ISE), the value of the window width used, and the optimal value of the CvM
statistic based on the nonparametric density estimation approach are given.
The graphs for these cases are given in figures 1 to 24, while the next table shows the
resulting MISE together with its standard deviation for sample size 20 for the different
parameter values, for both the newly proposed estimation technique and
the modified nonlinear method for solving the ML equations.

Table 6. Results from M.C. size 1000 for sample size 20
(standard deviations in parentheses)

Weibull(loc., sca., sha.)    MISE_CvM             MISE_MLE

W(10,5,1)                    .13209 (.1723J)      .13678 (.17820)
W(10,5,2)                    .04970 (.05061)      .26364 (.19757)
W(10,5,3)                    .03378 (.03385)      .20255 (.17740)
W(10,5,4)                    .02575 (.02551)      .40923 (.49626)

Tables 7 to 18 show that the choice of the h parameter varies from sample
to sample and from one shape parameter to another. The tables also show variations
in the value of the MISE over different shape parameters for the Weibull density. These
variations in the value of h, together with the variations in the MISE, indicate that
the method used is an adaptive one, in the sense that the choice of the parameter h,
which is data dependent, varies with the distribution shape and the
particular sample.

Thus the final conclusion is that the minimum distance estimation method, using the
CvM statistic as a measure of the difference between a nonparametric estimator based
on a suggested window width and a parametric density with unknown parameters,
gives in general a much smaller MISE value than the maximum likelihood method.
Table 7. Weibull Sample (Shape = 1.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 1.0
SAMPLE SIZE = 10

Weibull Data Values

10.009320
10.226890
10.798260
10.866060
11.054560
11.788680
14.245620
14.910420
17.955210
24.277580

                TRUE        MLE         MDCVM
LOCATION        10.0000     10.0080     7.4580
SCALE           5.0000      3.40U0      6.8850
SHAPE           1.0000      0.9500      1.2990
ISE                         0.1398      0.1253

Function Norm   3011.9329
Window Width    2.8563
Optimal CvM     0.0090

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,1), sample size 10, using nonparametric modified MDE technique]

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,1), sample size 10, using nonparametric modified MDE technique]
Table 8. Weibull Sample (Shape = 2.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 2.0
SAMPLE SIZE = 10

Weibull Data Values

10.905100
12.259120
13.099780
14.168940
14.219130
15.972290
16.971910
16.296600
16.373850
16.426060

                TRUE        MLE         MDCVM
LOCATION        10.0000     10.9041     0.4785
SCALE           5.0000      2.9963      14.9260
SHAPE           2.0000      0.9469      7.1468
ISE                         0.1231      0.0221

Function Norm   3234.4231
Window Width    1.2068
Optimal CvM     0.0085

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,2), sample size 10, using nonparametric modified MDE technique]

Figure 15. p.d.f. for W(10,5,2) with N=10

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,2), sample size 10, using nonparametric modified MDE technique]

Table 9. Weibull Sample (Shape = 3.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 3.0
SAMPLE SIZE = 10

Weibull Data Values

11.783980
12.080590
12.377060
13.530170
13.606270
13.616520
13.776460
13.902000
15.878190
16.802490

                TRUE        MLE         MDCVM
LOCATION        10.0000     1.7830      10.6560
SCALE           5.0000      4.8683      3.5345
SHAPE           3.0000      1.1199      1.7830
ISE                         0.3463      0.1040

Function Norm   13781.3096
Window Width    0.9975
Optimal CvM     0.0086

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,3), sample size 10, using nonparametric modified MDE technique]

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,3), sample size 10, using nonparametric modified MDE technique]

Figure 18. C.D.F. for W(10,5,3) with N=10
Table 10. Weibull Sample (Shape = 4.0 and Sample Size = 10)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 4.0
SAMPLE SIZE = 10

Weibull Data Values

11.963030
12.860350
13.219490
14.176460
14.512920
14.892670
14.919950
15.371890
15.641700
16.557470

                TRUE        MLE         MDCVM
LOCATION        10.0000     11.9620     6.1854
SCALE           5.0000      1.4014      8.8608
SHAPE           4.0000      1.6940      5.8242
ISE                         0.5027      0.0134

Function Norm   478115.3125
Window Width    0.8778
Optimal CvM     0.0085

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,4), sample size 10, using nonparametric modified MDE technique]

Figure 20. C.D.F. for W(10,5,4) with N=10
Table 11. Weibull Sample (Shape = 1.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 1.0
SAMPLE SIZE = 20

Weibull Data Values

10.058190    12.376410
10.227110    12.453510
10.360260    12.490680
10.423800    13.050770
10.537260    14.149840
11.759740    14.439350
11.876000    17.355110
11.892060    18.124390
11.906630    18.503469
12.154340    22.591120

                TRUE        MLE         MDCVM
LOCATION        10.0000     10.0572     8.7660
SCALE           5.0000      3.0587      5.0997
SHAPE           1.0000      0.9860      1.2873
ISE                         0.2165      0.1183

Function Norm   508.4898
Window Width    1.8328
Optimal CvM     0.0058

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,1), sample size 20, using nonparametric modified MDE technique]

Table 12. Weibull Sample (Shape = 2.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 2.0
SAMPLE SIZE = 20

Weibull Data Values

10.539380    13.447030
11.065610    13.502510
11.342130    13.528930
11.455670    13.905620
11.638990    14.555130
12.966260    14.711340
13.062680    16.061280
13.075760    16.373541
13.087580    16.520531
13.282030    17.934460

                TRUE        MLE         MDCVM
LOCATION        10.0000     10.5384     9.0263
SCALE           5.0000      2.1504      5.2000
SHAPE           2.0000      1.0423      2.2099
ISE                         0.4962      0.0737

Function Norm   1238.4871
Window Width    1.0842
Optimal CvM     0.0050

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,2), sample size 20, using nonparametric modified MDE technique]

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,2), sample size 20, using nonparametric modified MDE technique]

Figure 24. C.D.F. for W(10,5,2) with N=20
Table 13. Weibull Sample (Shape = 3.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 3.0
SAMPLE SIZE = 20

Weibull Data Values

11.133070    13.902000
11.783980    13.943750
12.080590    13.963560
12.196340    14.240820
12.377060    14.698840
13.530170    14.805660
13.606270    15.686470
13.616520    15.878190
13.625780    15.968230
13.776460    16.802490

                TRUE        MLE         MDCVM
LOCATION        10.0000     11.1321     8.9949
SCALE           5.0000      3.0818      5.4584
SHAPE           3.0000      1.1842      3.2065
ISE                         0.1308      0.0552

Function Norm   31529.4902
Window Width    0.8209
Optimal CvM     0.0047

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,3), sample size 20, using nonparametric modified MDE technique]

Figure 25. p.d.f. for W(10,5,3) with N=20

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,3), sample size 20, using nonparametric modified MDE technique]

Figure 26. C.D.F. for W(10,5,3) with N=20
Table 14. Weibull Sample (Shape = 4.0 and Sample Size = 20)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 4.0
SAMPLE SIZE = 20

Weibull Data Values

11.963030    14.216850
12.212220    14.512920
12.269390    14.717380
12.331440    14.892670
12.860350    14.919950
13.062320    15.143880
13.219490    15.371890
13.540850    15.641700
14.020930    16.008890
14.176460    16.557470

                TRUE        MLE         MDCVM
LOCATION        10.0000     11.9620     8.0986
SCALE           5.0000      3.4861      6.5663
SHAPE           4.0000      1.0087      4.1925
ISE                         0.1814      0.0497

Function Norm   3.8416
Window Width    0.7450
Optimal CvM     0.0051

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,4), sample size 20, using nonparametric modified MDE technique]

Table 15. Weibull Sample (Shape = 1.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 1.0
SAMPLE SIZE = 30

Weibull Data Values

10.058190    11.892060    14.149840
10.227110    11.906630    14.439350
10.360260    11.930340    15.096300
10.423800    12.154340    17.355110
10.537260    12.376410    18.124390
10.761180    12.453510    18.503469
11.346590    12.490680    19.405500
11.719840    12.508960    19.906870
11.759740    13.050770    22.591120
11.876000    13.821920    35.215561

                TRUE        MLE         MDCVM
LOCATION        10.0000     10.0572     7.8721
SCALE           5.0000      3.0323      6.8966
SHAPE           1.0000      1.0012      1.3983
ISE                         0.2278      0.0528

Function Norm   540.3060
Window Width    2.5974
Optimal CvM     0.0040

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,1), sample size 30, using nonparametric modified MDE technique]

Figure 29. p.d.f. for W(10,5,1) with N=30

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,1), sample size 30, using nonparametric modified MDE technique]

Figure 30. C.D.F. for W(10,5,1) with N=30
Table 16. Weibull Sample (Shape = 2.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 2.0
SAMPLE SIZE = 30

Weibull Data Values

10.539380    13.075760    14.555130
11.065610    13.087580    14.711340
11.342130    13.106710    15.047920
11.455670    13.282030    16.064280
11.638990    13.447030    16.373541
11.950870    13.502510    16.520531
12.594800    13.528930    16.857660
12.932440    13.541860    17.038059
12.966260    13.905620    17.934460
13.062680    14.371460    21.228439

                TRUE        MLE         MDCVM
LOCATION        10.0000     10.5384     9.5878
SCALE           5.0000      2.1066      4.9859
SHAPE           2.0000      1.0111      1.9222
ISE                         0.5117      0.0248

Function Norm   18.6620
Window Width    1.1761
Optimal CvM     0.0045

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,2), sample size 30, using nonparametric modified MDE technique]

Figure 32. C.D.F. for W(10,5,2) with N=30
Table 17. Weibull Sample (Shape = 3.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 3.0
SAMPLE SIZE = 30

Weibull Data Values

11.133070    13.616520    14.698840
11.783980    13.625780    14.805660
12.080590    13.640750    15.031900
12.196340    13.776460    15.686470
12.377060    13.902000    15.878190
12.669780    13.943750    15.968230
13.228930    13.963560    16.172211
13.503290    13.973240    16.279989
13.530170    14.240820    16.802490
13.606270    14.571660    18.574381

                TRUE        MLE         MDCVM
LOCATION        10.0000     11.1321     11.1331
SCALE           5.0000      2.6142      3.5205
SHAPE           3.0000      1.1138      1.8773
ISE                         0.2291      0.0182

Function Norm   10696.4004
Window Width    0.8241
Optimal CvM     0.0091

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,3), sample size 30, using nonparametric modified MDE technique]

Figure 33. p.d.f. for W(10,5,3) with N=30


Table 18. Weibull Sample (Shape = 4.0 and Sample Size = 30)

TRUE PARAMETERS ARE
Location = 10.0
Scale = 5.0
Shape = 4.0
SAMPLE SIZE = 30

Weibull Data Values

11.963030    13.634270    15.143880
12.212220    14.020930    15.145930
12.269390    14.054020    15.312240
12.331440    14.176460    15.371890
12.840970    14.216850    15.398440
12.860350    14.512920    15.618870
13.062320    14.717380    15.641700
13.130750    14.755560    16.008890
13.219490    14.892670    16.538071
13.540850    14.919950    16.557470

                TRUE        MLE         MDCVM
LOCATION        10.0000     11.9620     7.7954
SCALE           5.0000      3.3463      7.0515
SHAPE           4.0000      1.0086      4.7788
ISE                         0.1633      0.0195

Function Norm   90.9906
Window Width    0.6636
Optimal CvM     0.0048

[Figure: Parameter estimation for the three parameter Weibull density W(10,5,4), sample size 30, using nonparametric modified MDE technique]

[Figure: Parameter estimation for the three parameter Weibull C.D.F. W(10,5,4), sample size 30, using nonparametric modified MDE technique]

Figure 36. C.D.F. for W(10,5,4) with N=30
VII. GOODNESS OF FIT APPLICATION


Introduction

When a sample is drawn from a certain distribution, it is hoped that its empirical
distribution function (E.D.F.) will resemble the population cumulative distribution
function (C.D.F.). The resemblance here should take a quantitative meaning.
This is done by measuring the closeness, or the distance, of the E.D.F. to the C.D.F.
Thus, if $F_n(x)$ represents the E.D.F. and $F_0(x)$ represents the true theoretical C.D.F.,
then the many different ways of considering the distance between $F_n(x)$ and $F_0(x)$
suggest a wide class of goodness of fit statistics. Gini's index of dissimilarity, the
integrated absolute difference between both C.D.F.'s, is among these fitting criteria
between $F_n(x)$ and $F_0(x)$. This index is also modified by weighting the integral
by $F_0'(x)$. Cramer and von Mises introduced the well-known Cramer-von Mises statistic
(CvM) by weighting the integral of the squared difference between the C.D.F.'s
by $F_0'(x)$. Anderson and Darling introduced the Anderson-Darling statistic (AD)
by weighting the Cramer-von Mises integral by $[F_0(x)(1 - F_0(x))]^{-1}$. The Watson,
Kolmogorov-Smirnov, and Kuiper statistics are other examples of goodness of fit statistics. Such
statistics are surveyed by Stephens (1974). The computational formulae of some of
these statistics are given below:
K-S statistic K

$$ K = \max(D^{+}, D^{-}) $$  (127)

where

$$ D^{+} = \sup_{1 \le i \le n} \left(\frac{i}{n} - F_i\right) $$  (128)

$$ D^{-} = \sup_{1 \le i \le n} \left(F_i - \frac{i-1}{n}\right) $$  (129)

and $F_i$ is $F_0$ at the ith order statistic.

Anderson-Darling statistic $A^2$

$$ A^{2} = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left[\ln F_i + \ln(1 - F_{n+1-i})\right] $$  (130)

Cramer-von Mises statistic $W^2$

$$ W^{2} = \sum_{i=1}^{n} \left[F_i - \frac{2i-1}{2n}\right]^{2} + \frac{1}{12n} $$  (131)

Kuiper statistic V

$$ V = D^{+} + D^{-} $$  (132)

The Watson statistic $U^2$

$$ U^{2} = W^{2} - n(\bar{F} - .5)^{2} $$  (133)

where

$$ \bar{F} = \sum_{i=1}^{n} \frac{F_i}{n} $$  (134)
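The computational formulae above can be collected into one routine. The following Python sketch assumes that the input is the list of values F_i, that is, the hypothesized C.D.F. evaluated at the sample points, each strictly between 0 and 1 so that the logarithms in the Anderson-Darling statistic are defined. The function name and the dictionary keys are illustrative.

```python
import math


def edf_statistics(F_values):
    """Return K, V, W2, A2 and U2 computed from F_i = F0(x_(i))."""
    F = sorted(F_values)
    n = len(F)
    # Zero-based index i corresponds to the (i + 1)th order statistic.
    d_plus = max((i + 1) / n - F[i] for i in range(n))
    d_minus = max(F[i] - i / n for i in range(n))
    w2 = sum((F[i] - (2 * i + 1) / (2.0 * n)) ** 2 for i in range(n))
    w2 += 1.0 / (12.0 * n)
    a2 = -n - sum((2 * i + 1) * (math.log(F[i]) + math.log(1.0 - F[n - 1 - i]))
                  for i in range(n)) / n
    f_bar = sum(F) / n
    return {
        "K": max(d_plus, d_minus),
        "V": d_plus + d_minus,
        "W2": w2,
        "A2": a2,
        "U2": w2 - n * (f_bar - 0.5) ** 2,
    }
```

For a sample whose F_i fall exactly on the midpoints (2i - 1)/(2n), the routine returns W2 = 1/(12n), equal one-sided distances D+ and D-, and U2 equal to W2 because the mean of the F_i is one half.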


Modified  Goodness  Of  Fit  Test 

A goodness of fit test based on the E.D.F. in which the parameters are estimated
is called a modified goodness of fit test.

Basic Characterization

1) If the tables for a completely specified null hypothesis are used while the
parameters are estimated, the actual α error becomes much smaller and the
test is biased towards accepting $H_0$.

2) When the parameters are estimated, the null distribution of the test statistic,
and hence the percentage points, will not depend on the location or scale parameter.
However, one must use the same estimators as were used in the construction of the
tables.

Now, let F be a family of C.D.F.'s with location and scale parameters c and θ,
respectively, and let $F_0$ be the C.D.F. obtained by inserting estimators for c and θ under $H_0$.

Although the distribution of the test statistic and its percentage points do not
depend on c and θ, one should use the same estimators as those used to
construct the tables.

A modified K-S goodness of fit test obtained by Monte Carlo simulation for the normal
distribution with unknown μ, σ² (Lilliefors, 1966) and for the exponential distribution with
unknown mean (Lilliefors, 1967) was introduced, with a study of the power of the
test which showed that the modified K-S test had higher power than the χ²-test for the
normal case.

Woodruff et al. (1983) and Bush et al. (1983) derived tables for modified K-S,
CvM and AD tests for the Weibull distribution with shape parameter 1 (two parameter
negative exponential). Their study showed that the CvM test had the highest
power for most of the alternative distributions studied when the null hypothesis was
the two parameter negative exponential. They, in addition, studied the Weibull with
different shape parameters and showed that the AD statistic was the most powerful
when the null distribution was Weibull with shape parameter 3.5. A relationship
between the critical values and the inverse of the shape parameter was presented for
the range of the shape parameters studied.

As for the two parameter Weibull, a BLUE and a BLIE (best linear invariant
estimator) for the unknown parameters are found in Mann (1968), using the fact that
the two parameter Weibull is transformed into the extreme value distribution by a
logarithmic transformation. She also derived a goodness of fit test for the extreme
value distribution of smallest values.

Tables of critical values for the modified K-S, CvM and AD statistics, obtained using M.C.
techniques for the extreme value distribution where the MLE for the parameters is
used, are derived in a paper by Littell et al. in 1979.

Tables of the percentage points for the modified K-S, AD and CvM statistics
for the gamma distribution are derived in Woodruff et al. (1984).

In addition, similar tables are derived for the critical values for the modified
K-S, AD and CvM goodness of fit tests for the logistic distribution with unknown shape
and location parameter using MLE to estimate the parameters (Woodruff et al.,
1986).

Porter and Moore derived tables of critical values for the modified K-S, AD
and CvM goodness of fit statistics for the Pareto distribution with unknown shape
parameter. The powers were shown for eight alternative distributions. In addition,
they derived a functional relationship between the shape parameters and the critical
values of the test.

Yen and Moore derived tables of critical values for the modified AD and CvM
goodness of fit statistics for the Laplace distribution. The critical values were tabled
for sample sizes n = 5(5)50 and significance levels α = .1..2..5. The AD test generally
yielded higher power than the CvM test.

Harter et al. (1984) modified the definition of the C.D.F. at the ith order statistic
to obtain a modified K-S test statistic when the probability model is completely
specified. They have shown that their proposed test is more powerful than the usual
K-S tests for small to moderate sample sizes.

New goodness of fit tests for symmetric alternatives were obtained by Moore
for the normal distribution by using a reflection technique in which the data points
are reflected about an invariant estimate of the mean, which is used to double the
sample size. New tables were derived for the K-S, AD and CvM statistics.

Similar work was done by Woodruff et al. for the uniform distribution and
by Yen and Moore for the Laplace distribution.

As a final note, a problem arises when a goodness of fit test fails to reject two
families of distributions, which means that the test does not sufficiently discriminate
between these two families. Bain used a likelihood ratio test to discriminate normal versus
two parameter exponential; normal versus double exponential; normal versus Cauchy
or Weibull versus lognormal; and extreme value versus normal.


Methodology 

Two basic test statistics are used in this application. These statistics are based
on the Cramer-von Mises and the Anderson-Darling statistics.

(1) Cramer-von Mises statistic

The Cramer-von Mises statistic is defined as $W_n^2 = \int \left[F_n(x) - F_0(x)\right]^2 dF_0(x)$. This
statistic has the well known classical results that:

-  $W_n^2$ is a directed distance,

i.e. for any proper distribution functions $F_1(x)$ and $F_2(x)$:

$$ W_n^2(F_1, F_2) = 0 \iff F_1(x) = F_2(x) $$  (135)

and

$$ W_n^2(F_1, F_2) + W_n^2(F_2, F_3) \ge W_n^2(F_1, F_3) $$  (136)

-  $W_n^2$ is symmetric,

i.e. if $F_1(.)$ and $F_2(.)$ are continuous then:

$$ W_n^2(F_1, F_2) = W_n^2(F_2, F_1) $$  (137)

-  Since  0  <  F(r)  <  1  =>  0  <  \Y2{F\,  F2)  < 

(2) Anderson-Darling $A^2$

The AD statistic, considered a member of the Cramer-von Mises family, is defined
as:

$$ Q = \int_{-\infty}^{\infty} \{F_n(x) - F(x)\}^2\, \Psi(x)\, dF(x) $$  (138)

where $\Psi(x)$ is some function that weights the square of the difference between both
distribution functions. The CvM statistic sets this weight equal to 1, while the AD
statistic uses the weight $\Psi(x) = [F(x)(1 - F(x))]^{-1}$ and has the computational form

$$ A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left[\log F(X_{(i)}) + \log\left(1 - F(X_{(n+1-i)})\right)\right] $$  (139)

$$ A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left[\ln F_i + \ln(1 - F_{n+1-i})\right] $$  (140)

In this context a g.o.f. test is run using M.C. size 1000. The test is based on the
AD test statistic, where the nonparametric probability is used in place of the E.D.F.
The AD statistic is more sensitive to the distribution tail length by the construction
of the weight function above. The property of the E.D.F. that makes it
natural to use for goodness of fit to $F_0(x)$ (the theoretical distribution)
is its uniform, almost sure convergence to $F_0(x)$. Informally, one
rejects if $W^2$ is large and accepts when it is small.

In the application here $F_0(x)$ is assumed to be a univariate continuous distribution
function. This means that $F_0(X)$ will be uniformly distributed on (0,1).
The asymptotic behavior of $W^2$ when $F_0(x) = F(x)$ is given by:

$$ P\{nW^2 \le x\} \to F_{W^2}(x) $$  (141)

where 

FHv(*)  =  — "7=  .(lid  i 

-\/x 
The technique used is based upon the idea of using the nonparametric density
estimator in place of the E.D.F. Hence the goodness of fit application documents
this other new application with complete test elements.

The  Technique  And  The  Results 

The Monte Carlo procedure for this test was divided into 3 basic stages.

Stage I

(1) Determine the critical values for the test statistic at the predetermined
significance levels (.01, .05 (.05) .20).

(2) Compute the value of $W^2$ for each of the 1000 M.C. cases as a measure of
the distance between the parametric density with the maximum likelihood estimators
for the parameters $(\bar{x}, s^2)$ and a nonparametric fit for each sample.

(3) The 1000 M.C. samples yield a corresponding sample of size 1000 for $W^2$.
Thus, there are two ways to find the distribution of $W^2$:

(a) Use a plotting position.

(b) Fit a continuous nonparametric distribution.

(4) The nonparametric fit is used and the inverse function of the corresponding
C.D.F. is computed at the different levels of significance.
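Stage I can be sketched as a small Monte Carlo routine. The version below rests on several assumptions that are not confirmed by the source: the null family is normal, the distance is n times the integral of the squared difference between the kernel C.D.F. and the fitted normal C.D.F. (midpoint rule in p), the window width is s n^{-1/5}, and the critical values are read off as empirical quantiles (the plotting-position route) rather than from a nonparametric fit. It is therefore not expected to reproduce the tabulated critical values. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, pstdev, stdev


def kernel_w2(sample, grid=200):
    """Distance between the kernel C.D.F. and the fitted normal C.D.F."""
    n = len(sample)
    fitted = NormalDist(mean(sample), pstdev(sample))  # ML estimates
    h = stdev(sample) * n ** (-1.0 / 5.0)              # assumed window width
    std = NormalDist()
    total = 0.0
    for k in range(grid):
        p = (k + 0.5) / grid
        x = fitted.inv_cdf(p)
        f_hat = sum(std.cdf((x - xi) / h) for xi in sample) / n
        total += (f_hat - p) ** 2
    return n * total / grid


def critical_values(n, levels, replications=200, seed=1):
    """Empirical (1 - alpha) quantiles of the statistic under the null."""
    rng = random.Random(seed)
    stats = sorted(kernel_w2([rng.gauss(0.0, 1.0) for _ in range(n)])
                   for _ in range(replications))
    result = {}
    for alpha in levels:
        index = min(replications - 1,
                    max(0, math.ceil((1.0 - alpha) * replications) - 1))
        result[alpha] = stats[index]
    return result
```

A call such as critical_values(10, [0.20, 0.15, 0.10, 0.05, 0.01], replications=1000) mirrors the size of the experiment described above; smaller replication counts run faster but give noisier quantiles, especially at the .01 level.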
Stage II

(1) The corresponding power study for the hypothesis is conducted under $H_0$, and the power is computed.

(2) The test shows powers which were reasonably close to the $\alpha$-levels.

Stage III

The members of the following family of distributions are used as alternative distributions:

-Uniform

-$\chi^2$ with 1 d.f.

-$\chi^2$ with 4 d.f.

-Exponential

-Cauchy

-D.E. (double exponential)

-Student's t with 3 d.f.

-Logistic distribution

This family of distributions gives a variety of shapes and characteristics. The results from this part are shown in the following tables. The tables give the critical values for both cases, when the CvM statistic is used and when the AD statistic is used. The tables also show the power of both tests for different sample sizes.
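Stages II and III amount to estimating rejection rates, first under the normal null and then under each alternative. The sketch below illustrates this under the same assumptions as any simple kernel-based statistic: a Gaussian kernel, an assumed rule-of-thumb window width, and a critical value taken from the ordered null statistics. The function names and the samplers for the alternatives are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

Z = NormalDist()

def w2_statistic(sample, grid_size=41):
    """Kernel-based CvM-type distance from the ML-fitted normal C.D.F."""
    n = len(sample)
    mean = fmean(sample)
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    h = 1.06 * sd * n ** (-0.2)  # assumed window width
    total = 0.0
    for j in range(grid_size):
        u = (j + 0.5) / grid_size
        x = mean + sd * Z.inv_cdf(u)
        f_hat = fmean(Z.cdf((x - xi) / h) for xi in sample)
        total += (f_hat - u) ** 2
    return total / grid_size

def draw(name, n, rng):
    """Draw n values from the null (normal) or one of the eight alternatives."""
    def chi2(k):
        return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

    def logistic():
        u = max(rng.random(), 1e-16)  # guard against log(0)
        return math.log(u / (1.0 - u))

    samplers = {
        "normal": lambda: rng.gauss(0.0, 1.0),
        "uniform": rng.random,
        "chi2(1)": lambda: chi2(1),
        "chi2(4)": lambda: chi2(4),
        "exponential": lambda: rng.expovariate(1.0),
        "cauchy": lambda: math.tan(math.pi * (rng.random() - 0.5)),
        "double exponential": lambda: rng.expovariate(1.0) - rng.expovariate(1.0),
        "t(3)": lambda: rng.gauss(0.0, 1.0) / math.sqrt(chi2(3) / 3.0),
        "logistic": logistic,
    }
    return [samplers[name]() for _ in range(n)]

def power_table(n, alpha, names, reps=1000, seed=1):
    """Estimate the rejection rate at level alpha for each named distribution."""
    rng = random.Random(seed)
    null = sorted(w2_statistic(draw("normal", n, rng)) for _ in range(reps))
    crit = null[min(reps, math.ceil((1.0 - alpha) * reps)) - 1]
    table = {}
    for name in names:
        rejections = sum(w2_statistic(draw(name, n, rng)) > crit for _ in range(reps))
        table[name] = rejections / reps
    return table
```

Including `"normal"` in `names` gives the Stage II check: the estimated rejection rate under the null should fall near the chosen level, up to Monte Carlo error.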
Table 19. Critical Values for the New Suggested Test
for Sample Size = 5 (5) 60
(Using CvM)
(at Significance Levels .20, .15, .10, .05, .01)

N     0.20     0.15     0.10     0.05     0.01
5     0.0341   0.0352   0.0364   0.0382   0.0406
10    0.0335   0.0355   0.0387   0.0437   0.0507
15    0.0355   0.0385   0.0418   0.0478   0.0568
20    0.0384   0.0420   0.0455   0.0533   0.0717
25    0.0397   0.0436   0.0495   0.0568   0.0742
30    0.0414   0.0459   0.0508   0.0599   0.0739
35    0.0419   0.0467   0.0520   0.0623   0.0786
40    0.0447   0.0485   0.0554   0.0654   0.0874
45    0.0473   0.0522   0.0590   0.0719   0.0918
50    0.0487   0.0534   0.0600   0.0695   0.0989
55    0.0504   0.0556   0.0637   0.0753   0.0970
60    0.0510   0.0563   0.0639   0.0771   0.0977

Table 20. Power of Tests for Normal Distribution
with Sample Size = 5 (5) 60
(Using CvM)
(at Significance Levels .20, .15, .10, .05, .01)

N     0.20     0.15     0.10     0.05     0.01
5     0.2072   0.1667   0.1024   0.0572   0.0121
10    0.2209   0.1559   0.1083   0.0546   0.0105
15    0.2115   0.1575   0.1032   0.0481   0.0105
20    0.1979   0.1485   0.1000   0.0521   0.0103
25    0.1990   0.1504   0.0968   0.0505   0.0099
30    0.2041   0.1455   0.0975   0.0515   0.0101
35    0.1940   0.1489   0.1018   0.0503   0.0096
40    0.2011   0.1502   0.0968   0.0482   0.0100
45    0.1997   0.1392   0.0957   0.0476   0.0102
50    0.1951   0.1427   0.1018   0.0491   0.0098
55    0.1917   0.1505   0.0986   0.0498   0.0096
60    0.1965   0.1477   0.0972   0.0483   0.0097

Table 21. Power of Tests for Normal Distribution with Sample Size = 5
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.2590    0.4480    0.2330    0.3170
.15           0.1990    0.3860    0.1760    0.2530
.10           0.1480    0.3270    0.1420    0.2040
.05           0.0700    0.2510    0.0630    0.1420
.01           0.0070    0.1170    0.0170    0.0380

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.4060    0.1630    0.2350    0.1640
.15           0.3680    0.1190    0.1900    0.1220
.10           0.3280    0.0850    0.1450    0.0840
.05           0.2550    0.0490    0.0930    0.0430
.01           0.1640    0.0090    0.0320    0.0050

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 22. Power of Tests for Normal Distribution with Sample Size = 10
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.5310    0.8090    0.4520    0.6420
.15           0.4660    0.7660    0.3950    0.5650
.10           0.3320    0.6640    0.2810    0.4450
.05           0.2050    0.5000    0.1630    0.2920
.01           0.0700    0.2750    0.0580    0.1370

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.5280    0.2320    0.2540    0.2110
.15           0.5030    0.1930    0.2120    0.1710
.10           0.4560    0.1310    0.1460    0.1020
.05           0.3790    0.0630    0.1000    0.0470
.01           0.2790    0.0250    0.0450    0.0150

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 23. Power of Tests for Normal Distribution with Sample Size = 15
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.6600    0.9370    0.5550    0.7940
.15           0.5810    0.9050    0.4750    0.7400
.10           0.5040    0.8540    0.3800    0.6590
.05           0.3500    0.7610    0.2830    0.5180
.01           0.1590    0.5730    0.1260    0.3030

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.6390    0.2040    0.2560    0.2040
.15           0.6110    0.1640    0.2060    0.1500
.10           0.5810    0.1230    0.1590    0.1110
.05           0.5320    0.0770    0.1060    0.0600
.01           0.4540    0.0360    0.0560    0.0190

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 24. Power of Tests for Normal Distribution with Sample Size = 20
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.7530    0.9800    0.6740    0.8580
.15           0.6670    0.9690    0.5910    0.8130
.10           0.5820    0.9550    0.4990    0.7580
.05           0.3930    0.8890    0.3500    0.6300
.01           0.1070    0.6060    0.1240    0.3100

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.6610    0.1930    0.2550    0.1760
.15           0.6410    0.1560    0.2140    0.1320
.10           0.6160    0.1210    0.1850    0.1100
.05           0.5690    0.0800    0.1350    0.0510
.01           0.4670    0.0190    0.0620    0.0060

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.


Table 25. Power of Tests for Normal Distribution with Sample Size = 25
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.8360    0.9940    0.7570    0.9250
.15           0.7680    0.9900    0.7100    0.8950
.10           0.6360    0.9790    0.6090    0.8350
.05           0.5050    0.9520    0.4780    0.7520
.01           0.1910    0.8070    0.2340    0.4980

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.7410    0.1790    0.2530    0.1610
.15           0.7130    0.1330    0.2200    0.1180
.10           0.6840    0.0940    0.1700    0.0780
.05           0.6420    0.0650    0.1270    0.0400
.01           0.5640    0.0260    0.0740    0.0040

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 26. Power of Tests for Normal Distribution with Sample Size = 30
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.8720    0.9980    0.8140    0.9690
.15           0.8210    0.9950    0.7350    0.9530
.10           0.7500    0.9880    0.6600    0.9290
.05           0.6010    0.9740    0.5370    0.8540
.01           0.3470    0.9200    0.3340    0.6900

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.8080    0.1680    0.2690    0.1430
.15           0.7740    0.1390    0.2320    0.0990
.10           0.7450    0.1000    0.1980    0.0680
.05           0.6760    0.0690    0.1550    0.0330
.01           0.6080    0.0360    0.1120    0.0100

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.


Table 27. Power of Tests for Normal Distribution with Sample Size = 35
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9170    1.0000    0.8740    0.9800
.15           0.8770    0.9990    0.8310    0.9690
.10           0.8280    0.9980    0.7780    0.9470
.05           0.6830    0.9900    0.6440    0.8750
.01           0.4080    0.9620    0.4230    0.7420

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.8840    0.1720    0.3040    0.1450
.15           0.8610    0.1380    0.2630    0.1020
.10           0.8350    0.1100    0.2380    0.0750
.05           0.7720    0.0780    0.1860    0.0370
.01           0.6830    0.0360    0.1220    0.0080

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 28. Power of Tests for Normal Distribution with Sample Size = 40
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9370    0.9990    0.9000    0.9900
.15           0.9180    0.9990    0.8680    0.9830
.10           0.8560    0.9980    0.7830    0.9710
.05           0.7380    0.9960    0.6740    0.9320
.01           0.3820    0.9750    0.4380    0.8260

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9390    0.1710    0.3110    0.1290
.15           0.9200    0.1480    0.2800    0.0990
.10           0.8910    0.1090    0.2340    0.0640
.05           0.8420    0.0710    0.1860    0.0360
.01           0.7380    0.0280    0.1250    0.0070

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 29. Power of Tests for Normal Distribution with Sample Size = 45
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9530    1.0000    0.9040    0.9930
.15           0.9320    1.0000    0.8720    0.9890
.10           0.8910    1.0000    0.8130    0.9780
.05           0.7390    0.9990    0.7010    0.9520
.01           0.4480    0.9900    0.5110    0.8710

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9410    0.1730    0.3100    0.1250
.15           0.9290    0.1460    0.2770    0.0870
.10           0.9120    0.0980    0.2370    0.0620
.05           0.8770    0.0670    0.1800    0.0270
.01           0.7990    0.0290    0.1250    0.0080

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 30. Power of Tests for Normal Distribution with Sample Size = 50
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9630    1.0000    0.9410    0.9990
.15           0.9440    1.0000    0.9210    0.9990
.10           0.9170    1.0000    0.8890    0.9960
.05           0.8360    0.9990    0.8170    0.9860
.01           0.4620    0.9910    0.5450    0.9040

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9610    0.1920    0.3260    0.1240
.15           0.9540    0.1610    0.2970    0.0950
.10           0.9440    0.1250    0.2520    0.0610
.05           0.9250    0.0780    0.1990    0.0300
.01           0.8330    0.0190    0.1310    0.0060

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 31. Power of Tests for Normal Distribution with Sample Size = 55
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9810    1.0000    0.9600    0.9990
.15           0.9600    1.0000    0.9490    0.9980
.10           0.9190    1.0000    0.9110    0.9940
.05           0.8430    1.0000    0.8400    0.9870
.01           0.5910    1.0000    0.6620    0.9580

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9800    0.2030    0.3420    0.1050
.15           0.9730    0.1710    0.3180    0.0820
.10           0.9600    0.1190    0.2760    0.0440
.05           0.9440    0.0690    0.2340    0.0200
.01           0.8970    0.0300    0.1680    0.0040

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 32. Power of Tests for Normal Distribution with Sample Size = 60
(Using CvM)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.9830    1.0000    0.9810    1.0000
.15           0.9720    1.0000    0.9770    0.9990
.10           0.9480    1.0000    0.9550    0.9980
.05           0.8780    1.0000    0.8890    0.9960
.01           0.6750    0.9990    0.7290    0.9740

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9850    0.2180    0.3590    0.1120
.15           0.9810    0.1700    0.3200    0.0760
.10           0.9730    0.1360    0.2840    0.0520
.05           0.9600    0.0770    0.2280    0.0200
.01           0.9290    0.0330    0.1750    0.0080

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.




Table 33. Critical Values for the New Suggested Test
for Sample Size = 5 (5) 60
(Using AD)
(at Significance Levels .20, .15, .10, .05, .01)

N     0.20     0.15     0.10     0.05     0.01
5     1.1629   1.2454   1.4128   1.6494   2.1284
10    1.5496   1.6466   1.7901   2.1274   2.8964
15    1.9488   2.0551   2.1764   2.4629   3.2314
20    2.2129   2.3563   2.5602   2.8515   3.6844
25    2.4551   2.5653   2.7135   3.0238   3.6552
30    2.6596   2.7707   2.9593   3.2455   4.1849
35    2.8755   2.9862   3.1885   3.4801   4.2493
40    3.0863   3.2155   3.3152   3.7069   4.3719
45    3.2613   3.4019   3.5699   3.8462   4.6924
50    3.4458   3.5522   3.7423   4.0242   4.6589
55    3.6080   3.7104   3.9092   4.1381   4.8513
60    3.7494   3.8857   4.0693   4.3398   4.9610

Table 34. Power of Tests for Normal Distribution
with Sample Size = 5 (5) 60
(Using AD)
(at Significance Levels .20, .15, .10, .05, .01)

N     0.20     0.15     0.10     0.05     0.01
5     0.1640   0.1170   0.0600   0.0270   0.0050
10    0.2140   0.1700   0.1100   0.0520   0.0130
15    0.1950   0.1580   0.1140   0.0660   0.0090
20    0.1780   0.1140   0.0630   0.0290   0.0030
25    0.1760   0.1400   0.1030   0.0490   0.0140
30    0.1850   0.1360   0.0830   0.0400   0.0080
35    0.1910   0.1490   0.0870   0.0440   0.0040
40    0.1820   0.1340   0.1070   0.0340   0.0050
45    0.1680   0.1110   0.0660   0.0310   0.0002
50    0.1770   0.1280   0.0780   0.0460   0.0050
55    0.1870   0.1420   0.0960   0.0510   0.0100
60    0.1850   0.1280   0.0900   0.0420   0.0100

Table 35. Power of Tests for Normal Distribution with Sample Size = 5
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.1590    0.6210    0.3760    0.4950
.15           0.1140    0.5810    0.3190    0.4460
.10           0.0540    0.4960    0.2220    0.3400
.05           0.0160    0.3710    0.1380    0.2280
.01           0.0050    0.2160    0.0310    0.1000

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.4590    0.1860    0.2740    0.1510
.15           0.4210    0.1320    0.2160    0.0980
.10           0.3370    0.0720    0.1440    0.0570
.05           0.2330    0.0410    0.0820    0.0220
.01           0.1150    0.0090    0.0390    0.0060

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 36. Power of Tests for Normal Distribution with Sample Size = 10
(Using AD)
(Power from K-S test in brackets and with * when better)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.2170     0.9410*    0.6460*    0.8120*
              (.2688)    (.7850)    (.4138)    (.5710)
.15           0.1690     0.9280*    0.5920*    0.7670*
              (.2112)    (.7366)    (.3488)    (.5120)
.10           0.1090     0.8970*    0.5320*    0.7180*
              (.1420)    (.6608)    (.2716)    (.4318)
.05           0.0510     0.8240*    0.3720*    0.5790*
              (.0724)    (.5420)    (.1806)    (.3208)
.01           0.0080     0.6350*    0.1700*    0.3420*
              (.0128)    (.3430)    (.0708)    (.1612)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           0.6980     0.3200     0.3750*    0.2530*
              (.7306)    (.3604)    (.3610)    (.2486)
.15           0.6570     0.2760     0.3190*    0.1860
              (.6998)    (.3030)    (.3066)    (.1990)
.10           0.5870     0.2090     0.2540*    0.1240
              (.6532)    (.2376)    (.2500)    (.1418)
.05           0.4510     0.1060     0.1540     0.0590
              (.5884)    (.1572)    (.1726)    (.0874)
.01           0.2830     0.0340     0.0590     0.0150
              (.4660)    (.0646)    (.0838)    (.0252)

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 37. Power of Tests for Normal Distribution with Sample Size = 15
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.2200    0.9910    0.7400    0.9190
.15           0.1720    0.9890    0.6880    0.8960
.10           0.1250    0.9830    0.6400    0.8650
.05           0.0590    0.9590    0.5250    0.7910
.01           0.0100    0.8780    0.2730    0.5940

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.8500    0.3430    0.3940    0.2240
.15           0.8090    0.2940    0.3430    0.1730
.10           0.7780    0.2320    0.2940    0.1210
.05           0.7000    0.1430    0.2110    0.0640
.01           0.5460    0.0370    0.0880    0.0180

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 38. Power of Tests for Normal Distribution with Sample Size = 20
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.2440    0.9980    0.8700    0.9660
.15           0.1570    0.9980    0.8240    0.9500
.10           0.1010    0.9960    0.7590    0.9210
.05           0.0500    0.9890    0.6430    0.8760
.01           0.0080    0.9580    0.3610    0.7110

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9160    0.4010    0.4640    0.2470
.15           0.8900    0.3490    0.3970    0.1780
.10           0.8570    0.2520    0.3280    0.1100
.05           0.7900    0.1550    0.2430    0.0510
.01           0.6410    0.0370    0.1090    0.0120

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 39. Power of Tests for Normal Distribution with Sample Size = 25
(Using AD)
(Power from K-S test in brackets and with * when better)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.2800     1.0000*    0.9170*    0.9960*
              (.3704)    (.9904)    (.6566)    (.8914)
.15           0.2060     1.0000*    0.8850*    0.9910*
              (.2998)    (.9860)    (.5974)    (.8528)
.10           0.1440     1.0000*    0.8570*    0.9870*
              (.2156)    (.9738)    (.5146)    (.7960)
.05           0.0710     0.9990*    0.7590*    0.9550*
              (.1172)    (.9492)    (.3872)    (.6882)
.01           0.0220     0.9940*    0.5710*    0.8690*
              (.0294)    (.8484)    (.1932)    (.4536)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           0.9680*    0.4710     0.5350*    0.2630
              (.9559)    (.5084)    (.5138)    (.2670)
.15           0.9600*    0.4040     0.4680*    0.2080
              (.9452)    (.4402)    (.4596)    (.2150)
.10           0.9530*    0.3250     0.4080*    0.1570*
              (.9298)    (.3618)    (.3866)    (.1494)
.05           0.9190*    0.2150     0.2970     0.0880*
              (.9000)    (.2566)    (.3004)    (.0876)
.01           0.8210*    0.0850     0.1620     0.0200
              (.8385)    (.1196)    (.1700)    (.0244)

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 40. Power of Tests for Normal Distribution with Sample Size = 30
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.3280    1.0000    0.9410    0.9970
.15           0.2340    1.0000    0.9230    0.9970
.10           0.1570    1.0000    0.8790    0.9940
.05           0.0750    1.0000    0.8180    0.9890
.01           0.0130    0.9960    0.5540    0.9130

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9840    0.5210    0.5960    0.2740
.15           0.9820    0.4560    0.5440    0.2200
.10           0.9590    0.3530    0.4570    0.1540
.05           0.9420    0.2390    0.3390    0.0880
.01           0.8370    0.0650    0.1600    0.0120

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 41. Power of Tests for Normal Distribution with Sample Size = 35
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.3340    1.0000    0.9670    1.0000
.15           0.2650    1.0000    0.9550    0.9990
.10           0.1630    1.0000    0.9270    0.9960
.05           0.0900    1.0000    0.8740    0.9910
.01           0.0140    0.9990    0.7180    0.9510

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9920    0.5640    0.6460    0.2810
.15           0.9920    0.4950    0.5960    0.2310
.10           0.9840    0.3910    0.4990    0.1450
.05           0.9690    0.2600    0.3960    0.0870
.01           0.9170    0.0810    0.2190    0.0170

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 42. Power of Tests for Normal Distribution with Sample Size = 40
(Using AD)
(Power from K-S test in brackets and with * when better)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.3770     1.0000*    0.9880*    1.0000*
              (.5284)    (.9896)    (.8340)    (.9828)
.15           0.2850     1.0000*    0.9740*    1.0000*
              (.4482)    (.9844)    (.7910)    (.9752)
.10           0.2290     1.0000*    0.9680*    1.0000*
              (.3424)    (.9726)    (.7248)    (.9556)
.05           0.0890     1.0000*    0.9050*    0.9990*
              (.1978)    (.9490)    (.6036)    (.9074)
.01           0.0250     1.0000*    0.7570*    0.9900*
              (.0454)    (.8570)    (.3548)    (.7204)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           0.9970*    0.6110     0.6650*    0.2810
              (.9918)    (.6376)    (.6324)    (.2958)
.15           0.9960*    0.5300     0.5990*    0.2180
              (.9888)    (.5858)    (.5892)    (.2450)
.10           0.9950*    0.4630     0.5590*    0.1810*
              (.9862)    (.5114)    (.5132)    (.1798)
.05           0.9840*    0.2870     0.4200*    0.0850
              (.9766)    (.3852)    (.4140)    (.1044)
.01           0.9560*    0.1090     0.2610*    0.0250
              (.9498)    (.1820)    (.2482)    (.0312)

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 43. Power of Tests for Normal Distribution with Sample Size = 45
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.4360    1.0000    0.9890    1.0000
.15           0.3440    1.0000    0.9810    1.0000
.10           0.2280    1.0000    0.9680    0.9990
.05           0.1410    1.0000    0.9370    0.9990
.01           0.0220    1.0000    0.8100    0.9920

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9990    0.6590    0.7130    0.3210
.15           0.9970    0.5640    0.6540    0.2410
.10           0.9960    0.4730    0.5770    0.1760
.05           0.9900    0.3480    0.4500    0.0900
.01           0.9630    0.1030    0.2350    0.0180

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 44. Power of Tests for Normal Distribution with Sample Size = 50
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.4630    1.0000    0.9900    1.0000
.15           0.3860    1.0000    0.9850    1.0000
.10           0.2800    1.0000    0.9720    1.0000
.05           0.1510    1.0000    0.9580    1.0000
.01           0.0390    1.0000    0.8960    0.9980

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           0.9990    0.7140    0.7270    0.3390
.15           0.9990    0.6510    0.6810    0.2720
.10           0.9990    0.5420    0.5940    0.1900
.05           0.9970    0.3990    0.4890    0.0970
.01           0.9820    0.1710    0.3280    0.0270

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 45. Power of Tests for Normal Distribution with Sample Size = 55
(Using AD)
(Normal against one of the following:)

Sign. level   Uniform   Chi(1)    Chi(4)    Expon.
.20           0.5140    1.0000    0.9970    1.0000
.15           0.4340    1.0000    0.9970    1.0000
.10           0.3090    1.0000    0.9940    1.0000
.05           0.1930    1.0000    0.9840    1.0000
.01           0.0430    1.0000    0.9280    0.9980

Sign. level   Cauchy    D.E       t(3)      Logistic
.20           1.0000    0.7450    0.7480    0.3510
.15           0.9990    0.6850    0.7080    0.2930
.10           0.9990    0.5720    0.6300    0.2060
.05           0.9990    0.4550    0.5480    0.1210
.01           0.9910    0.1910    0.3620    0.0260

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Table 46. Power of Tests for Normal Distribution with Sample Size = 60
(Using AD)
(Power from K-S test in brackets and with * when better or the same)
(Normal against one of the following:)

Sign. level   Uniform    Chi(1)     Chi(4)     Expon.
.20           0.5900     1.0000*    1.0000*    1.0000*
              (.6800)    (1.000)    (.9348)    (.9994)
.15           0.4650     1.0000*    0.9990*    1.0000*
              (.6012)    (1.000)    (.9132)    (.9984)
.10           0.3380     1.0000*    0.9970*    1.0000*
              (.4918)    (1.000)    (.8640)    (.9960)
.05           0.1980     1.0000*    0.9940*    1.0000*
              (.3038)    (1.000)    (.7648)    (.9838)
.01           0.0610     1.0000*    0.9640*    1.0000*
              (.0952)    (.9998)    (.5392)    (.9312)

Sign. level   Cauchy     D.E        t(3)       Logistic
.20           1.0000*    0.7710*    0.7810*    0.3780*
              (.9994)    (.7536)    (.7512)    (.3306)
.15           1.0000*    0.7010     0.7400*    0.2940*
              (.9990)    (.7036)    (.7024)    (.2736)
.10           1.0000*    0.6130     0.6700*    0.2130*
              (.9986)    (.6264)    (.6356)    (.1990)
.05           1.0000*    0.4660     0.5780*    0.1090
              (.9970)    (.4816)    (.5282)    (.1130)
.01           0.9940*    0.2290     0.3800*    0.0270
              (.9926)    (.2664)    (.3632)    (.0342)

Chi(k) = Chi square with k d.f.
Expon = Negative Exponential
D.E = Double exponential
t(3) = t-distribution with 3 d.f.

Thus, this application defines a new modified goodness of fit test based on the nonparametric kernel density estimator. Both the CvM and AD statistics are used. The critical values are derived by Monte Carlo experiment. Then the power of the test for the case of the CvM and the AD statistics is obtained when the underlying distribution is normal. This power shows a value which is close to the significance level. The test is then performed against each of the eight different alternatives. The power for the different distributions using the CvM statistic increases with sample size. The test discriminates all other distributions with high power; however, it does not do as well for the double exponential and the logistic distributions. The modified test using the AD statistic gives better power than the test based on the CvM statistic for the different alternatives except for the uniform distribution, due to the fact that the AD statistic is more sensitive to the tails of the distributions than the CvM. The results from the power of the test using the AD statistic are compared to those of the classical K-S test for sample sizes 10, 25, 40, and 60. The power from the new modified test using the AD statistic shows an improvement over the classical K-S test in all cases except for the uniform distribution.
VIII. Adaptive Nonparametric Kernel Density Estimation Application

Introduction 

In this chapter an "adaptive" approach for the density estimation is introduced. This approach is based on a given criterion according to which a suitable or near optimal adaptive choice of the window width is used for the kernel fit.

The general strategy for this application is to generate different samples from various distributions. For each sample, a criterion to classify or discriminate the parent distribution from which the sample is drawn is computed. Based on this criterion, a suitable choice of the window width for each case is found. The chosen h value is considered an adaptive choice in this case since it varies with the computed sample criterion. Once the adaptive choice for the h parameter is found, a nonparametric kernel estimate of the underlying density is computed.

Percentile  Ratios 

For the development of this application, a discriminant was needed. The kurtosis, Hogg's Q statistic, and the percentile ratios are examples of such discriminants that could be used. Since both the kurtosis and the Q statistic average the measures for the upper and lower tail lengths, they are not suitable for asymmetric distributions. The percentile ratios were chosen as the discriminant since they measure both tail lengths separately. The upper and lower tail lengths of the distribution are measured by the upper and lower percentile ratios, which are defined respectively to be:


$$P_u = \frac{F^{-1}(.975) - F^{-1}(.5)}{F^{-1}(.75) - F^{-1}(.5)} \qquad (143)$$

$$P_l = \frac{F^{-1}(.5) - F^{-1}(.025)}{F^{-1}(.5) - F^{-1}(.25)} \qquad (144)$$

where $F^{-1}(\alpha)$ represents the $\alpha$ percentile of the distribution.

The population percentile ratios for some distributions with location parameter zero and scale parameter 1 are given in the following table.

Table 47. Values of Percentile Ratios for Different Distributions

Distribution          P_l       P_u
Uniform               1.900     1.900
Logistic              3.343     3.343
Exponential           1.647     4.322
Double Exponential    4.322     4.322
Cauchy                12.706    12.706
Normal                2.904     2.904
Beta(1/2,1/2)         1.409     1.409
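The population percentile ratios follow directly from the quantile function of each distribution. The sketch below evaluates them from closed-form quantile functions in standard form; the dictionary `QUANTILES` and the function `percentile_ratios` are names introduced here for illustration, and the Beta(1/2,1/2) entry uses the arcsine quantile function $\sin^2(\pi p/2)$.

```python
import math
from statistics import NormalDist

# Quantile functions F^{-1}(p) in standard form (location 0, scale 1).
QUANTILES = {
    "Uniform": lambda p: p,
    "Logistic": lambda p: math.log(p / (1.0 - p)),
    "Exponential": lambda p: -math.log(1.0 - p),
    "Double Exponential": lambda p: (
        math.log(2.0 * p) if p < 0.5 else -math.log(2.0 * (1.0 - p))
    ),
    "Cauchy": lambda p: math.tan(math.pi * (p - 0.5)),
    "Normal": NormalDist().inv_cdf,
    "Beta(1/2,1/2)": lambda p: math.sin(math.pi * p / 2.0) ** 2,
}

def percentile_ratios(q):
    """Return (lower, upper) percentile ratios for a quantile function q."""
    med = q(0.5)
    lower = (med - q(0.025)) / (med - q(0.25))
    upper = (q(0.975) - med) / (q(0.75) - med)
    return lower, upper

if __name__ == "__main__":
    for name, q in QUANTILES.items():
        lower, upper = percentile_ratios(q)
        print(f"{name:20s} {lower:8.3f} {upper:8.3f}")
```

Evaluated this way, the uniform, exponential, double exponential and Cauchy ratios agree with the tabulated values to three decimals, while the normal (about 2.906) and logistic (about 3.335) ratios come out slightly different from the tabulated 2.904 and 3.343.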

The  median  rank  is  used  to  find  the  a-  sample  percentiles  based  on  a  sample 
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of  size  GO.  This  gives  the  sample  percentile  ratio  as: 


P_u = \frac{a_5 - a_3}{a_4 - a_3}        (145)

P_l = \frac{a_3 - a_1}{a_3 - a_2}        (146)

where

a_1 = .19 X_{(1)} + .81 X_{(2)}          (147)

a_2 = .6 X_{(15)} + .4 X_{(16)}          (148)

a_3 = .5 X_{(30)} + .5 X_{(31)}          (149)

a_4 = .399 X_{(45)} + .601 X_{(46)}      (150)

a_5 = .81 X_{(59)} + .19 X_{(60)}        (151)

and X_{(i)} is the ith order statistic.
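The interpolation weights for the sample percentiles can be reproduced from the median-rank plotting position. The following is a minimal sketch, assuming the common approximation (i - 0.3)/(n + 0.4) for the median rank of the ith order statistic; the function names are introduced here for illustration. For n = 60 this gives weights of .19/.81, .6/.4, .5/.5, .4/.6 and .81/.19 on the order-statistic pairs (1, 2), (15, 16), (30, 31), (45, 46) and (59, 60).

```python
import math


def median_rank_percentile(sorted_x, p):
    """Interpolate the p-th sample percentile between adjacent order statistics.

    Assumes the median-rank plotting position (i - 0.3) / (n + 0.4) for X_(i).
    """
    n = len(sorted_x)
    pos = p * (n + 0.4) + 0.3             # fractional 1-based order-statistic index
    pos = min(max(pos, 1.0), float(n))    # clamp to the available order statistics
    i = min(int(math.floor(pos)), n - 1)  # lower neighbour X_(i)
    w = pos - i                           # weight placed on X_(i+1)
    return (1.0 - w) * sorted_x[i - 1] + w * sorted_x[i]


def sample_percentile_ratios(sample):
    """Return (P_l, P_u) computed from the five interpolated percentiles."""
    x = sorted(sample)
    a1, a2, a3, a4, a5 = (
        median_rank_percentile(x, p) for p in (0.025, 0.25, 0.5, 0.75, 0.975)
    )
    p_u = (a5 - a3) / (a4 - a3)
    p_l = (a3 - a1) / (a3 - a2)
    return p_l, p_u
```

Passing the integers 1 to 60 as the sample makes each interpolated percentile equal to its fractional position, which is a convenient way to read off the weights.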

To fit a nonparametric distribution to the given data using kernel estimation,
the value of the window width h must be found. A numerically
optimal value for various distributions for a sample size of 20 is found in Chapter IV.
The form used for the h value is:


h_{opt} = k c s n^{-1/5}        (152)

where k is a constant that varies from one distribution to another,
c is an adjusting factor for the unbiasedness of s,
s is the sample standard deviation,
n is the sample size.

For this application a sample size of n = 60 is used for the adaptive method. The
form for the optimal h is the same as in Chapter IV. The adjusting factor c is 1.0133.
The following table gives the values of k for different distributions.

Table 48.  Suggested k for the h value

Distribution              k
Uniform                1.0589
Logistic               1.4821
Exponential             .5334
Double Exponential      .8376
Cauchy                  .9657
Normal                 1.1789
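To illustrate how a tabulated k turns into a density estimate, the sketch below computes a window width and evaluates a kernel estimate. Two assumptions are made and should be checked against Chapter IV: the width is taken to have the form h = k c s n^(-1/5), and a Gaussian kernel is used because this section does not name the kernel. The function names are hypothetical.

```python
import math
import random
from statistics import stdev


def window_width(sample, k, c=1.0133):
    """Window width h = k * c * s * n**(-1/5) (assumed form of the width)."""
    n = len(sample)
    return k * c * stdev(sample) * n ** (-1.0 / 5.0)


def kernel_density(sample, h):
    """Return a function f_hat(x) built with a Gaussian kernel (an assumption)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return f_hat


# Usage: 60 standard normal deviates with the k suggested for the normal case.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(60)]
h = window_width(data, k=1.1789)
f_hat = kernel_density(data, h)
print(f"h = {h:.4f}, f_hat(0) = {f_hat(0.0):.4f}")
```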

An  Adaptive  Methodology 

A Monte Carlo experiment of size 1000 was performed with samples of size 60 to
find the average and the standard deviation of the sample percentile ratios for some
distributions. The next table shows the resulting average upper and lower sample
percentile ratios, with standard deviations given in brackets.

Table 49.  Average sample percentile ratios (standard deviation in brackets)

Distribution             P_l                    P_u
Uniform                 1.9750 (.3937)         1.9513 (.4042)
Exponential             1.7167 (.3172)         4.2838 (1.5324)
Cauchy                 83.9780 (634.0062)     13.8451 (16.5956)
Double Exponential      5.2356 (2.0799)        4.1618 (1.4824)
Logistic                3.9424 (1.3424)        3.2630 (1.0109)
Normal                  3.2831 (.9372)         2.8688 (.7899)
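A simulation of this kind can be sketched as follows. This is not the original program: the median-rank approximation (i - 0.3)/(n + 0.4), the random number generator, the seed and all function names are assumptions made for illustration, so the figures it prints will differ somewhat from the tabulated ones.

```python
import random
from statistics import mean, stdev


def _percentile(x, p):
    # Median-rank interpolation between adjacent order statistics of sorted x.
    n = len(x)
    pos = min(max(p * (n + 0.4) + 0.3, 1.0), float(n))
    i = min(int(pos), n - 1)
    w = pos - i
    return (1.0 - w) * x[i - 1] + w * x[i]


def ratios(sample):
    """Return the sample percentile ratios (P_l, P_u)."""
    x = sorted(sample)
    a1, a2, a3, a4, a5 = (_percentile(x, p) for p in (0.025, 0.25, 0.5, 0.75, 0.975))
    return (a3 - a1) / (a3 - a2), (a5 - a3) / (a4 - a3)


def simulate(draw, reps=1000, n=60, seed=0):
    """Average and standard deviation of (P_l, P_u) over reps samples of size n."""
    rng = random.Random(seed)
    results = [ratios([draw(rng) for _ in range(n)]) for _ in range(reps)]
    lows, ups = zip(*results)
    return (mean(lows), stdev(lows)), (mean(ups), stdev(ups))


# Usage: uniform(0, 1) samples.
(ml, sl), (mu, su) = simulate(lambda rng: rng.random())
print(f"P_l: {ml:.4f} ({sl:.4f})   P_u: {mu:.4f} ({su:.4f})")
```

Any other distribution can be simulated by passing a different `draw` function that takes the generator and returns one deviate.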

The adaptive nonparametric density estimation procedure starts
by generating 1000 samples, each of size 60, from the above distributions. The sample
percentile ratios for each sample are then computed. A piecewise linear relation is used,
based on the three pairs (p, k) from the uniform, normal, and logistic distributions,
where p and k represent the percentile ratio and the constant defined earlier for
these distributions, respectively.

The support is subdivided into three subsets S_1, S_2 and S_3 such that
\bigcup_{j=1}^{3} S_j = \mathbb{R} and such that:

S_1 = \{x \mid x < F^{-1}(.25)\}                          (153)

S_2 = \{x \mid F^{-1}(.25) \le x \le F^{-1}(.75)\}        (154)

S_3 = \{x \mid x > F^{-1}(.75)\}                          (155)

Table 50.  MISE for the adaptive technique (with standard deviation in brackets)

Distribution      MISE_adapt            MISE_III
Uniform           .08237 (.01891)       .08267 (.02017)
Logistic          .03244 (.01552)       .03904 (.01763)
Normal            .00870 (.00621)       .00883 (.00625)

The h value is chosen to vary with each subset of the support. The h is
empirically chosen to be a function of the distribution tail length, in the sense that
a different value of h is chosen for each of the three subsets of the support. This is
done by interpolating the piecewise linear relation at the measured P_l and P_u and
finding the corresponding k.
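The interpolation step can be sketched as below. Several details are assumptions, since the text does not fix them: the knots pair the population percentile ratios of the uniform, normal and logistic distributions with their suggested k values, ratios outside the knot range are clamped to the nearest end value, and the function name is hypothetical. How the center subset obtains its k is not specified here, so the sketch covers only the two tails.

```python
from bisect import bisect_right

# (percentile ratio, k) knots for the uniform, normal and logistic distributions.
# Assumption: population ratios paired with the suggested k values.
KNOTS = [(1.900, 1.0589), (2.904, 1.1789), (3.343, 1.4821)]


def k_from_ratio(p, knots=KNOTS):
    """Piecewise linear interpolation of k at percentile ratio p.

    Ratios outside the knot range are clamped to the end values (an assumption).
    """
    ps = [pt[0] for pt in knots]
    if p <= ps[0]:
        return knots[0][1]
    if p >= ps[-1]:
        return knots[-1][1]
    j = bisect_right(ps, p)                  # first knot strictly to the right of p
    (p0, k0), (p1, k1) = knots[j - 1], knots[j]
    return k0 + (k1 - k0) * (p - p0) / (p1 - p0)


# Usage: measured lower and upper ratios give the constants for the two tails.
p_l, p_u = 2.2, 3.1
print(f"k for lower tail: {k_from_ratio(p_l):.4f}")
print(f"k for upper tail: {k_from_ratio(p_u):.4f}")
```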

The results from this chapter are shown in Table 50. The table gives the MISE for
this adaptive approach, denoted MISE_adapt, and the MISE from Chapter III, where
the estimator for the window width was s n^{-1/5}. The table shows that the adaptive
method does slightly better for the uniform and normal distributions, while
for the logistic distribution it gives a 20% improvement in the MISE over
that of Chapter III. This result indicates that the adaptive technique, applied here
to a different sample size (60), works with the values of the constant obtained
from the Monte Carlo experiment in Chapter IV, and hence could be used in
applications that require no assumption about the form of the distribution. This
chapter therefore provides another tool for applications besides the ones discussed
in the earlier chapters.

Appendix  A.  Generation  of  random  deviates


1.  Cauchy  Distribution 

The probability density function is given by:

f(x) = \frac{b}{\pi [(x - a)^2 + b^2]},    a, x \in \mathbb{R},  0 < b < \infty

with

mode(x) = a,  median(x) = a

F(x) = \int_{-\infty}^{x} \frac{b}{\pi [(t - a)^2 + b^2]} dt
     = \frac{1}{\pi} \tan^{-1}\left(\frac{x - a}{b}\right) + .5

and the generated deviate will be given by:

x = b \tan[\pi (u - .5)] + a

where u is a uniform (0,1) deviate.

Also, it could be generated using the fact that if (x_1, x_2) are uniformly distributed
in a circle centered at the origin, then x_1/x_2 will be Cauchy distributed.
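Both Cauchy generation methods can be sketched in a few lines. The function names and the use of Python's `random.Random` as the uniform source are assumptions for illustration; the second function obtains a point uniformly distributed in the unit circle by rejection from the enclosing square, then scales and shifts the ratio of its coordinates.

```python
import math
import random


def cauchy_inverse(rng, a=0.0, b=1.0):
    """Cauchy deviate by inverting the C.D.F.: x = b * tan(pi * (u - .5)) + a."""
    u = rng.random()
    return b * math.tan(math.pi * (u - 0.5)) + a


def cauchy_ratio(rng, a=0.0, b=1.0):
    """Cauchy deviate from the ratio of coordinates of a point in the unit circle."""
    while True:
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        # Accept only points inside the circle; skip x2 == 0 to avoid division by zero.
        if x1 * x1 + x2 * x2 <= 1.0 and x2 != 0.0:
            return b * (x1 / x2) + a


# Usage: the sample median should be close to a, since the mean does not exist.
rng = random.Random(0)
xs = sorted(cauchy_inverse(rng, a=2.0, b=1.0) for _ in range(10001))
print(f"sample median = {xs[5000]:.3f}")
```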
2.  Logistic  Distribution

The probability density function is given by:

f(x) = \frac{\exp[-(x - a)/b]}{b (1 + \exp[-(x - a)/b])^2}

with a C.D.F.

F(x) = \frac{1}{1 + \exp[-(x - a)/b]}

with

E(x) = a,  V(x) = \frac{\pi^2 b^2}{3},  mode(x) = a

with variates generated by:

x = a + b \ln\left(\frac{u}{1 - u}\right)

where u is a uniform (0,1) deviate.
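Logistic variates follow from inverting the logistic C.D.F. F(x) = 1/(1 + exp[-(x - a)/b]), which gives x = a + b ln(u/(1 - u)) for a uniform deviate u. The sketch below assumes that inversion; the function name is hypothetical, and a zero draw is rejected because the logarithm is undefined there.

```python
import math
import random
from statistics import mean, variance


def logistic_deviate(rng, a=0.0, b=1.0):
    """Logistic deviate by inversion: x = a + b * ln(u / (1 - u))."""
    u = rng.random()
    while u == 0.0:          # random() can return 0.0, where log is undefined
        u = rng.random()
    return a + b * math.log(u / (1.0 - u))


# Usage: compare sample moments with E(x) = a and V(x) = pi^2 b^2 / 3.
rng = random.Random(0)
xs = [logistic_deviate(rng, a=1.0, b=2.0) for _ in range(20000)]
print(f"mean = {mean(xs):.3f}, variance = {variance(xs):.3f}")
print(f"theoretical variance = {math.pi ** 2 * 4.0 / 3.0:.3f}")
```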


3.  Weibull  Distribution 

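The 3-parameter Weibull density function is given by:

f(x) = \frac{\beta}{\theta} \left(\frac{x - \delta}{\theta}\right)^{\beta - 1} \exp\left[-\left(\frac{x - \delta}{\theta}\right)^{\beta}\right],    \delta < x,  \theta, \beta > 0

with expected value

E(x) = \delta + \theta \Gamma\left(\frac{\beta + 1}{\beta}\right)

and with variance

V(x) = \theta^2 \left[\Gamma\left(\frac{\beta + 2}{\beta}\right) - \Gamma^2\left(\frac{\beta + 1}{\beta}\right)\right]

where \Gamma denotes the gamma function,
and C.D.F.

F(x) = 1 - \exp\left[-\left(\frac{x - \delta}{\theta}\right)^{\beta}\right]

and the variates generated by:

x = \delta + \theta \exp\left[\frac{\ln(-\ln(1 - R))}{\beta}\right]

where R is a uniform (0,1) deviate.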
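A three-parameter Weibull generator can be sketched by inverting the C.D.F. F(x) = 1 - exp[-((x - delta)/theta)^beta]. The parameter names `delta` (location), `theta` (scale) and `beta` (shape), like the function name, are assumptions made for this sketch. The power form used here equals delta + theta * exp[ln(-ln(1 - R))/beta] and avoids taking the logarithm of zero when R = 0.

```python
import math
import random
from statistics import mean


def weibull_deviate(rng, delta=0.0, theta=1.0, beta=1.0):
    """Weibull deviate by inversion: x = delta + theta * (-ln(1 - R))**(1/beta)."""
    r = rng.random()                       # R in [0, 1), so 1 - R lies in (0, 1]
    return delta + theta * (-math.log(1.0 - r)) ** (1.0 / beta)


# Usage: compare the sample mean with E(x) = delta + theta * Gamma((beta + 1) / beta).
rng = random.Random(0)
delta, theta, beta = 1.0, 2.0, 1.5
xs = [weibull_deviate(rng, delta, theta, beta) for _ in range(20000)]
expected = delta + theta * math.gamma((beta + 1.0) / beta)
print(f"sample mean = {mean(xs):.3f}, expected value = {expected:.3f}")
```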